ABSTRACT

This study (the first part of a two-part study)
analyzed the effectiveness of three mandatory welfare employment 
programs in serving different segments of the Aid to Families with 
Dependent Children (AFDC) caseload. Data were collected in 
evaluations of welfare employment initiatives in San Diego, 
Baltimore, and several counties in Virginia. Program participation 
was required for different portions of the AFDC caseload. Eligible 
applicants and recipients (primarily female) were randomly assigned 
to experimental groups, which received program services, or to 
control groups, which did not. Data were collected using AFDC 
payments and Unemployment Insurance earnings records. With few
exceptions, employment and earnings impacts were consistently smaller 
than average for the welfare applicants and recipients who had the 
best work records and the least prior welfare experience. The impacts 
were usually larger for more dependent individuals, although not for 
the cases that were the most dependent. Programs had less consistent 
impacts on subgroups defined by characteristics such as marital 
status and educational level. The outcome measures examined were not 
valid indicators of program performance. Neither job entries nor
cases off welfare were a satisfactory predictor of the changes in
employment, earnings, and welfare receipt achieved by the programs
studied. (20 references) (YLb)

This report presents a preliminary analysis of the effectiveness of
three mandatory welfare employment programs in serving different segments
of the Aid to Families with Dependent Children (AFDC) caseload. The
analysis, covering the first phase of a two-part study, has been undertaken
to obtain two kinds of information that are useful in designing and
operating such programs. One is estimates of the programs' relative
impacts on the employment and welfare receipt of different groups of
welfare recipients. These impacts may indicate groups to which program
services can best be targeted in order to use funds efficiently. The other
is the development and validation of short-term performance indicators,
which are important in judging these programs' performance in meeting their
long-term objectives of increasing employment and reducing welfare
dependency.

The analysis is based on data collected in evaluations of welfare
employment initiatives in San Diego, Baltimore and several counties in
Virginia. Participation in the programs was required for different
portions of the AFDC caseload who are "mandatory" under federal Work 
Incentive (WIN) Program regulations. The programs also provided different 
services and operated in different labor markets. 

The populations served and the three programs' services are as 
follows. The San Diego program required the participation of new AFDC
applicants in a three-week job search workshop. Those who did not find
jobs during this time were then assigned to a 13-week work experience
position in a public or nonprofit agency. Baltimore's program required
both applicants and newly-mandatory recipients to participate, but
activities could be selected from a number of job search, work experience,
education and training options. In Virginia, the program required job
search of the entire WIN-mandatory caseload, which was sometimes followed
by work experience, education or training. This program differed from the
others in that it operated in rural as well as urban areas of the state.

Each of the three evaluations used experimental research designs to 
estimate program impacts. Eligible applicants and recipients were randomly
assigned to experimental groups, which received program services, or to
control groups, which did not. (It should be noted that an applicant for
welfare at the time of random assignment was called an "applicant"
throughout the study, even though many became recipients.) The experience
of the control groups -- which could have received services from sources
other than the programs -- indicates what would have happened to the experimental
groups in the absence of the programs, providing a benchmark against which 
to measure program impacts. 

Data were collected using AFDC payments and Unemployment Insurance 
earnings records for varying periods of up to three years in San Diego and
Baltimore; only a short follow-up period was available in Virginia. The
data considered in this analysis are for single-parent (primarily female)
heads of households. Two-parent households (primarily men eligible under
the AFDC-Unemployed Parent program) were included in two of the program
evaluations, but are not part of this study's sample.

The distinction between "outcomes" and "impacts" underlies most of the
findings of this analysis. An "outcome" is the employment or welfare
status of a person at a specified point after program enrollment. An
"impact" is the change in outcomes produced by a program during that
period, or simply the outcome difference between the experimental and
control groups. Program impacts are smaller than outcomes because the
normal job-finding and welfare departure rates of the AFDC population --
i.e., the control group's level -- are not zero in the absence of a
program. Past research, however, has indicated that groups exhibiting
worse-than-average outcomes may generate better-than-average impacts.

Subgroup Impact Differences

The analysis focuses on female WIN-mandatory AFDC subgroups defined by 
two characteristics: prior work and welfare history. The samples were 
divided into subgroup categories according to simple objective measures of
job-readiness and welfare dependence at the time these individuals became 
eligible for the program (i.e., were randomly assigned). Three subgroups 
were based on earnings from employment in the year prior to random assign- 
ment: no earnings, $1 to $2,999, or $3,000 or more. Similarly, three other 
subgroups were created according to the length of time that these people 
had been on welfare (that is, had had their own AFDC case) before random 
assignment: never, two years or less, or more than two years. 
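
The two subgroup definitions above amount to simple threshold rules. A minimal Python sketch follows. The function names, the use of months as the unit for welfare duration, and the treatment of any positive earnings under $1 as belonging to the middle category are assumptions made for illustration and are not details taken from the study.

```python
def prior_earnings_group(earnings_prior_year: float) -> str:
    """Classify by earnings (in dollars) in the year before random assignment."""
    if earnings_prior_year <= 0:
        return "no earnings"
    if earnings_prior_year < 3000:
        return "$1 to $2,999"
    return "$3,000 or more"


def prior_welfare_group(months_on_own_afdc_case: int) -> str:
    """Classify by time spent on one's own AFDC case before random assignment."""
    if months_on_own_afdc_case <= 0:
        return "never"
    if months_on_own_afdc_case <= 24:
        return "two years or less"
    return "more than two years"
```

Under these assumptions, a person with $1,200 in prior-year earnings and 30 months on AFDC would fall into the "$1 to $2,999" and "more than two years" subgroups.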

Other characteristics, such as marital status and prior education, 
were also examined, but their role in determining impacts was not as 
consistent across programs as the prior earnings and welfare dependency 
measures.

• When subgroups were defined by previous work and welfare
experience, the job-ready and least welfare-dependent
groups had below-average program impacts, which were often the
smallest impacts.

With few exceptions, employment and earnings impacts were consistently
smaller than average for the welfare applicants and recipients who had the
best work records and the least prior welfare experience. Frequently,
program impacts on these groups were the smallest. This does not mean that
the more job-ready and less dependent people who enrolled in these programs
were less able to find jobs than those with poor records. In fact, as one
would expect, these people entered employment much more frequently. But
control group members with the better work records or no welfare experience
also found employment almost as easily, so program interventions made less
of a difference with these groups.

This point is demonstrated in Table 1, which shows composite estimates 
from the separate samples of AFDC applicants and recipients in the three 
programs analyzed. These estimates should be interpreted with care because
they do not show the underlying variation across programs. Nevertheless,
the table indicates that experimentals with $3,000 or more in earnings in
the pre-program year -- the highest earnings category -- achieved an
average employment rate of 62 percent per quarter. At the same time,
experimentals who had not worked at all in the year prior to program
enrollment had only a 26 percent employment rate. Yet the employment
impact for the first group was somewhat below the average, at 3.1
percentage points. The "less employable" group attained a [illegible].9
percentage point gain, the highest of the three prior-earnings categories.

Similarly, individuals who had never had an AFDC case in the past
achieved above-average employment rates, but experienced virtually no
impacts on employment and earnings, while those with extensive prior
welfare experience had lower employment rates but showed larger employment
gains.

Program impacts on welfare incidence and the amount of welfare
payments were smaller than impacts on employment and earnings. Composite
estimates are shown in Table 2. Again, sample members with high earnings
and low prior welfare receipt often showed relatively smaller welfare
impacts, although the overall pattern was less consistent than for
employment impacts.

• The impacts were usually larger for more dependent
individuals, although not for the cases that were most 
dependent. This suggests that some program models may operate 
most effectively with individuals above some threshold level 
of employability. 

While the impacts of the three programs on employment and welfare were 
often larger for the more dependent segments of the AFDC caseload, this was 
not uniformly true. For example, the impacts for recipients in both 
Baltimore and Virginia — who, by definition, had been on welfare for a 
period of time — were substantially smaller than for applicants. In fact, 
the applicant impact on quarterly earnings was about three times the size 
of the recipient impact. 

These findings suggest that the relationship between individual
dependency and program impacts is not linear. In Figure 1, estimates of
the San Diego and Baltimore program impacts on earnings were plotted
against an estimated dependency score for each individual (reflecting
predicted welfare use and earnings) based on prior work, welfare experience
and other characteristics, such as number of children and educational
attainment.

In Baltimore -- where the program worked with cases over a wide range
of dependency -- clients with the highest predicted welfare receipt and
lowest earnings (the far left-hand side of Figure 1) did not appear to
benefit from the particular services offered. Above some threshold level
of dependency, the impact on earnings increased. But at the other end of
the spectrum (the far right-hand side of the figure), the program again had
less effect. The relatively job-ready WIN-mandatory caseload seemed better
able to enter employment and leave welfare without program help. Thus,
this program had its greatest effect on the large block of enrollees in the
middle.

San Diego served a less dependent population -- only the AFDC
applicants. Again, impacts are smaller for those at either end of the
dependency spectrum (see the bottom graph in Figure 1), although San
Diego's curve is less pronounced than Baltimore's.

Dependency impact profiles should be developed for more than just two
programs before final conclusions are drawn. Different service models may
produce different profiles. For example, program services planned
especially for highly dependent individuals (such as supported work) or for
relatively job-ready individuals (such as job placement assistance) may
have very different contours. It may also be important to consider program
performance in relationship to the different labor markets. Judging from
the limited results available thus far from rural counties in Virginia --
where the economic conditions were very different from those in San Diego
or Baltimore -- program experience did not appear to fit the pattern of
Figure 1.

• The programs had less consistent impacts on subgroups of the
WIN-mandatory AFDC caseload who were defined by characteristics
such as marital status and educational level.

While many factors together may contribute to the impact results, 
single characteristics other than prior earnings and welfare history did
not generally produce consistent impact differences in this study. For
example, the number of children in a household, which is related to welfare
dependency, is not alone a consistent explanatory variable for impacts.
This was also true for variables such as race and marital status. Prior
employment and welfare receipt are thus the most important characteristics
to consider when trying to improve the results of welfare employment
programs for WIN-mandatory individuals.

However, some of the other characteristics were important in specific 
program settings. For example, a higher level of education was positively 
related to impacts in the San Diego program, which did not offer 
educational services and was designed to move people into the labor market
quickly. That was not true in Baltimore, which did offer remedial 
education services. 

Program Performance Measures 

While program performance should ideally be assessed in terms of
impacts, only short-term outcome measures such as "job entries"
(placements) and cases "off-welfare" (case closures) are available in most
instances. This subgroup impact analysis suggests not only that these
measures overstate impacts, but that they also misrepresent the relative
performance of certain subgroups of welfare recipients. Thus, unless
subgroup differences are taken into account, current performance measures
may be sending the wrong signals to program administrators about the groups
who should be receiving priority for program services. It is important to
note, however, that other measures -- such as average entry wage levels --
could not be addressed in this analysis.

• Unadjusted "job entry" and "off-welfare" measures were poorly
correlated with the employment and welfare impacts of the
programs in San Diego and Baltimore. Hence, these measures by
themselves are not good indicators of program performance.

The relationship between outcomes and program impacts was examined by 
estimating impacts for each member of the experimental group and then 
determining the correlation of these estimates with the outcome measures. 
The conclusion is that the outcome measures examined were not valid
indicators of impacts. Neither job entries nor cases off-welfare were a
satisfactory predictor of the changes in employment, earnings and welfare
receipt achieved by the programs studied. These findings remained true
when differential program costs were considered.

This conclusion -- which runs counter to common wisdom -- simply
reflects the fact that the magnitude of the program effect on finding a job
or leaving welfare is greater for some groups of individuals than others.
This does not imply that programs should stop trying to help all people in
the caseload find jobs and leave welfare. It does mean that judging
programs on the basis of these outcome measures -- without considering
differences in caseload characteristics and economic conditions -- is
unwise. It is quite possible, for example, for a program with a relatively
low placement rate in a poor labor market to have greater impacts than
another program with a more job-ready caseload and more placements. The
analysis also shows that this conclusion does not change when longer-term
employment rates are substituted for immediate job entries.

• Weighting performance measures to reflect prior work histories
greatly improved their value in predicting program impacts. 

Giving more weight to job entries and movement off welfare by cases
with no or limited employment experience improved the correlation between
the performance measures and earnings impacts. One weighting scheme tested
gave four points for a job entry by a person not employed in the previous
year and two points or one point to people who had some pre-program
earnings ($1 to $2,999 or $3,000 or more, respectively). A number of other
weighting procedures were tried, and some of these were also an improvement
over unweighted indicators.
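
The four-two-one scheme described above can be sketched in a few lines of Python. This is an illustration only: the function names and the two example caseloads are hypothetical, and the sketch assumes that two points go to entrants with $1 to $2,999 in prior-year earnings and one point to those with $3,000 or more.

```python
# Points per job entry, keyed by the entrant's earnings in the year before
# enrollment (assumed reading of the scheme described in the text).
POINTS = {"no earnings": 4, "$1 to $2,999": 2, "$3,000 or more": 1}


def earnings_group(earnings_prior_year: float) -> str:
    """Assign a prior-earnings category from dollars earned in the prior year."""
    if earnings_prior_year <= 0:
        return "no earnings"
    if earnings_prior_year < 3000:
        return "$1 to $2,999"
    return "$3,000 or more"


def weighted_job_entries(prior_earnings_of_entrants) -> int:
    """Sum the points earned for each person who entered a job."""
    return sum(POINTS[earnings_group(e)] for e in prior_earnings_of_entrants)


# Two hypothetical programs with ten job entries each.
job_ready_program = [5000.0] * 10  # every entrant earned $3,000 or more
dependent_program = [0.0] * 10     # no entrant had prior-year earnings

job_ready_score = weighted_job_entries(job_ready_program)
dependent_score = weighted_job_entries(dependent_program)
```

An unweighted count rates the two hypothetical programs equally at ten job entries each, while the weighted measure credits the program serving people without recent earnings with four times as many points.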

• Like job entries and welfare case closures, simple program
participation measures can give a misleading impression of
program performance. Weighted participation or "program
coverage" measures -- while more difficult for program
operators to use -- may be better suited to assessing the
performance of mandatory welfare employment programs.

Performance measures based on participation -- that is, active
enrollment in program services or activities -- are sometimes used in
addition to job entry and welfare outcome measures. Participation standards
can be important because they have the advantage of encouraging program
operators to serve a broad range of those eligible. However, these measures
also have some drawbacks, especially for mandatory welfare employment
programs, which have sanctions that reduce welfare grants for individuals
who do not cooperate with participation requirements. These programs
intentionally attempt to affect the behavior of nonparticipants as well as
participants. Moreover, because this analysis suggests that unweighted
participation measures may misrepresent any program effectiveness that is
related to participation, priorities or weighting schemes for AFDC
subgroups should be considered if these measures are to be used.

Program "coverage" measures provide a possible alternative, although 
so far they have only been used as an analytical tool in program evalu- 
ation. In measuring the number of people covered by a program, a broader 
view of program contact is taken than just participation. The number of
cases in which participation is no longer required -- because someone
becomes employed or leaves AFDC on his or her own -- as well as those in
which sanctions for nonparticipation have been imposed, are counted in
addition to cases of participation. The proportion of "uncovered" cases
directs attention to the group the program has not reached — that is, the 
individuals who are still on welfare, unemployed, have not begun to satisfy 
program requirements, or have not been sanctioned for noncompliance. 
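
To make the difference between participation and coverage concrete, the following Python sketch counts a case as covered if it participated, became employed, left AFDC, or was sanctioned, and as uncovered only if none of these applies. The `Case` record and its field names are hypothetical, and reading "uncovered" as the absence of all four conditions is an interpretation of the description above.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Status flags for one enrolled case (hypothetical field names)."""
    participated: bool = False
    employed: bool = False
    left_afdc: bool = False
    sanctioned: bool = False


def is_covered(case: Case) -> bool:
    """A case is covered if the program has reached it in any of four ways."""
    return (case.participated or case.employed
            or case.left_afdc or case.sanctioned)


def coverage_rates(cases):
    """Return (participation, coverage, uncovered) rates as fractions."""
    n = len(cases)
    if n == 0:
        raise ValueError("need at least one case")
    participated = sum(c.participated for c in cases)
    covered = sum(is_covered(c) for c in cases)
    return participated / n, covered / n, (n - covered) / n
```

For a hypothetical caseload of four cases in which one participated, one found work, one was sanctioned, and one had no program contact, the participation rate is 25 percent but the coverage rate is 75 percent, leaving 25 percent uncovered.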

Such measures, however, are generally not used at present and have a 
number of practical limitations, including extensive and potentially
expensive changes in data collection procedures. 

Conclusions and Open Issues

The research reported in this document addresses a number of important 
issues in the monitoring and targeting of welfare employment programs. It 
also raises questions relevant to the broader employment and training
delivery system. While the results to date are striking and suggest the
promise of further research, they should be considered preliminary and
suggestive, rather than definitive. In some cases, the implications are
quite clear. But in others, they raise questions to which the appropriate
policy response is less clear. 

For example, a convincing lesson from this study is that, if resources
are limited, it is a mistake to concentrate only on serving the most
job-ready portion of the AFDC caseload. Since this was the tendency in the
WIN program, this is an important message and suggests a shift in strategy.
Thus, performance measures should be revised to encourage programs to work
with more dependent and less job-ready individuals. Unweighted outcome
measures clearly do not do this. Weighted measures create more appropriate
incentives by explicitly taking subgroup differences into account. Many
are already recognizing this lesson and adopting measures to try to adjust
service priorities and monitoring tools.

On the other side, readers should be cautioned that the results do not
yet suggest an exclusive focus on the more disadvantaged or the immediate
adoption of one particular weighting scheme. These cautions are suggested
by several factors. First, while impacts were smaller for the more
job-ready, they were sometimes positive. Second, and more important, the
results reported in this analysis were for programs that made no targeting
choices and thus mixed in job clubs, placement efforts and other activities
for individuals with a wide range of prior work experience and other
"employability" factors. This study thus cannot say whether similar
positive impacts for the more disadvantaged could be obtained by programs
that served only such groups.

One could well imagine, for example, that including the more job-ready
in job search workshops helped motivate both program staff and the most
disadvantaged and thus contributed to the positive results reported here.
This "mainstreaming" hypothesis is not tested in this study, but it
suggests that administrators should look carefully at the operational
results of more targeted services before exclusively using resources for
this group. In addition, working only with individuals with lower skills
and measured outcomes could have political, administrative or stigmatizing
effects. For example, it may be difficult to convince people that a
placement rate of 30 percent represents a substantial positive achievement.
Such low rates may also discourage staff efforts. And employers may think
differently about a work program that refers only clients with no prior
work history.

The results of this study are most convincing when they suggest not 
serving only the most job-ready but rather serving a broad range of the
caseload, with differential rewards or monitoring structures. They do not 
yet confirm exclusive targeting. 

Finally, the weighting schemes examined in this analysis were only 
tested in two programs. It will be important to see whether their
advantages hold up with different groups and programs in different states 
before a particular formula is adopted. Thus, while the results to date 
indicate possible directions to go in improving program monitoring, they do 
not prescribe a formula that would be valid in a wide range of economic, 
demographic and programmatic conditions. 

In addition, readers should be aware of a number of caveats and open 
questions. First, the results presented here come from mandatory programs
enrolling everyone within a specified group of welfare recipients. Very
different issues and lessons could arise in selective programs that can
choose the people they wish to enroll. Program operators, for example,
could screen intensely among the more disadvantaged, possibly identifying
only the most motivated within this group, and thus undercut the very
message implicit in the results reported here.

A major open question arises from the preliminary finding that -- at
least for the relatively inexpensive and often non-intensive services
studied -- there may be a threshold effect: i.e., impacts may be smaller
for the most dependent persons. It will be important to examine whether
this is also true for programs that provide more intensive services.
Notably, can programs offering intensive educational remediation or
long-term education and skills training change the shape of the impact
curve in Figure 1 and succeed in increasing the earnings of the most
disadvantaged? Results from another study -- Supported Work -- suggest
that at least that treatment had substantial benefits for some members of
this group.

Finally, performance measures are only useful if they can be 
implemented; the data must be available and the calculations possible to do
in a reasonable period of time. The analysis in this report drew on an
unusual data base, which is not readily available to program
administrators. It will be important to examine the feasibility and cost
of adopting some of these measures.

Some of these questions require further operational experience, and 
some go beyond what can be learned from the programs included in MDRC's
study. Others will be addressed during the second phase of the planned
research, drawing on the large knowledge base of this study and on the
promising directions seen so far.

CHAPTER 1
INTRODUCTION



The search for valid and workable standards of performance to be used 
in employment programs for welfare recipients has been one of the major
themes in current efforts to reform welfare policy. Such close attention
is warranted because performance measures are one of the primary means by
which broad policy is translated into the specific objectives that guide
the operations of programs. Standards allow administrators to assess how
well their programs are doing, to evaluate the worth of innovative
programs, and to identify problems in existing models. They can also
influence the programs' service priorities, encouraging a focus on the
welfare groups most likely to help the programs achieve a high performance
rating. In this manner, standards also influence the allocation of funds
and, in a period of fiscal restraint, it is important that performance
measures promote efficient utilization of resources.

Given this importance, performance measures should be appropriate for
the programs that use them. Poorly designed or inadequately tested
performance standards can work against the objectives of the authorizing
legislation. They can waste staff time and other program resources, with
the result that neither the welfare population nor society is well served.

This paper examines performance monitoring by studying three 
employment and training programs for recipients of Aid to Families with 
Dependent Children (AFDC), programs in which participation was mandatory.
It is the first phase of a two-part investigation into the differences in
the impacts of such programs on the employment and welfare receipt of
selected AFDC subgroups. The study uses data from the Demonstration of
State Work/Welfare Initiatives, a five-year, eight-state series of
large-scale social experiments conducted by the Manpower Demonstration
Research Corporation (MDRC). The data are unusual in that the research
samples they describe were generated in controlled experiments involving
random assignment. They are also comprehensive enough to permit program
performance to be considered in terms of multiple program effects as well
as program
costs. Complete subgroup analyses are presented for two programs, with a 
preliminary analysis offered for the third program, which has limited 
follow-up data at this time. 

It should be emphasized that all of the programs were targeted to AFDC
case heads meeting the Work Incentive (WIN) program definition of
mandatory: single parents (mostly women) who had no child under the age of
six, and had no other known barriers to participation. This so-called
WIN-mandatory group makes up about one-third of the AFDC caseload
nationwide. Unemployed heads of two-parent households, who are also
mandatory, were part of the research in the states that served this group,
but these samples have been excluded because they are primarily men, with
different work backgrounds, and receive assistance under different rules.

This analysis uses the subgroup impacts generated from the experimental
data on the three programs to evaluate the validity of two frequently
used performance measures: the number of "job entries" (placements) and the
number of cases "off-welfare" (case closures). Some alternative standards
are also considered. However, the implications of the analysis are
somewhat broader in scope than welfare employment programs because many of
the issues examined are common to other programs for low-income or
disadvantaged groups, such as those funded by the Job Training Partnership
Act (JTPA). This study's focus on testing employment and welfare receipt
measures should not imply that measures such as wage rates, job retention
and participation have no validity for some programs.

The discussion is structured as follows. This chapter reviews issues
relevant to welfare population subgroups and program performance. Chapter
2 discusses the welfare employment programs studied and their research
designs, and is followed in Chapter 3 by an explanation of the methodology
used to estimate subgroup impacts and to test performance indicators.
Chapters 4 and 5 are central to the analysis. Chapter 4 presents impacts
and costs for the major subgroups in the study, while Chapter 5 evaluates
the validity of alternative performance measures, using program impact
estimates. Since this is the first phase of the study, no conclusions are
as yet offered.

A. Issues in Assessing Program Performance Monitoring

Performance measures are intended to promote program effectiveness,
conserve resources, and ensure compliance with overall [illegible] and
directives. A wide range of indicators has been developed and used in the
WIN program, those funded by the Comprehensive Employment and Training Act
(CETA), and, more recently, by the Job Training Partnership Act.
Historically, job placements and welfare reductions have been the most
important indicators in WIN. These measures have seemed useful in conveying
program achievements in straightforward terms to policymakers and the
general public. Their incorporation into the fiscal WIN Allocation Formula
underlined their significance to operators of welfare employment programs.
Other indicators, however, were also part of the WIN Allocation Formula,
such as the quality of job entries, usually measured by wage rates and job
retention.[3] In measuring employment outcomes, all enrollees have been
counted with equal weight.

These indicators all measure the outcomes of a registrant's program
experience at some point after registration. Another set of indicators
looks at the activity of registrants while in the program; these include
counts of registrants, participants, program completers and similar
measures. Participation data have been examined in evaluations of WIN,
CETA and other programs, but the trend today has been to deemphasize these
indicators, even though they provide immediate feedback and are relatively
inexpensive to collect.[4] Instead, emphasis has been on measures that
communicate program goals in terms of post-program outcomes. For example,
the JTPA legislation explicitly requires that standards for adult
participants be based on job entries, wages and earnings, retention and
welfare reductions.[5]

1. Outcomes and Impacts

The distinction between "outcomes" and "impacts" is critical to an
understanding of how well outcomes measure program performance. An outcome
is the employment and/or welfare status of a person at some point in time
after program registration. Hence, the outcome "employed and not receiving
welfare at quarter 4" describes the status of a person 9 to 12 months after
program entry.

The real effects of a program cannot be judged by outcomes, however,
given the high degree of normal job-finding and welfare departure within
the welfare population. Program impacts, in contrast, do state the true
program effects -- if they have been correctly estimated. Impacts measure
a change in behavior, one that can be estimated by comparing the behavior
of a group of people who receive the program treatment with that of a
similar group of people who do not receive the treatment: i.e., a control
group, the behavior of which in the three programs studied is discussed in
Chapter 3. The distinction between a level -- the outcome -- and a change
-- the impact -- is important because program impacts are likely to be far
smaller than program outcomes, since factors such as the control group's
employment rate are not zero in the absence of a program.

Past research has suggested that groups exhibiting worse-than-average
outcomes may, in fact, experience better-than-average program impacts. For
example, an evaluation of a job search and work experience program operated
in San Diego found that 73 percent of WIN-mandatory AFDC applicants who had
worked at some time during the year prior to their program entry were able
to find employment during the year and one-half following enrollment. This
high rate was, in fact, only a 2 percentage point change from the control
group employment level -- that is, the rate that applicants with a prior
work history were able to achieve on their own. In contrast, enrollees
without prior employment attained only a 48 percent employment rate, but
this outcome was a 10 percentage point increase, or impact, from the
control group's employment rate of 38 percent.6
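
The arithmetic behind this example can be sketched in a few lines. The names below are illustrative assumptions, not part of the evaluation, and the 71 percent control rate for the prior-work group is not reported directly; it is inferred here from the 73 percent outcome and the 2 percentage point impact cited above.

```python
def impact(experimental_rate, control_rate):
    """Impact = experimental outcome minus control outcome (percentage points)."""
    return experimental_rate - control_rate

# San Diego applicants: employment rates in percent.
# The 71 percent control rate is inferred (73 - 2), not quoted from the report.
groups = {
    "worked in prior year": {"experimental": 73, "control": 71},
    "no prior-year work": {"experimental": 48, "control": 38},
}

impacts = {
    name: impact(rates["experimental"], rates["control"])
    for name, rates in groups.items()
}

for name, rates in groups.items():
    print(f"{name}: outcome {rates['experimental']}%, impact {impacts[name]} points")
```

Ranked by outcome, the prior-work group looks more successful; ranked by impact, the order reverses.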

Given these patterns, performance indicators based only on outcomes
create a misleading impression of program effectiveness. Clearly, they
overstate program impacts because the measures have no comparison against
which to judge change. However, a problem more serious than simple
overstatement may exist. Program resources may be ineffectively targeted
if these standards place emphasis on serving the least appropriate groups
-- that is, those who would have done well on their own, without the
programs. Conversely, people who could benefit most from these programs
may be underserved. The important role of performance measures in
determining how programs are operated and how resources are allocated is
the principal reason that this examination of current performance measures
has been undertaken.

The findings in this paper and similar ones from other studies suggest 
that consideration be given to the development of performance formulas that 
do not treat each person's outcome equally. Such formulas allow outcome 
standards to vary by local economic conditions, registrant characteristics, 
and even by service components. Regression adjustment is one way to 
develop formulas that permit more flexible performance standards for pro- 
grams serving groups with a low likelihood of finding employment readily, 
or those operating in areas with relatively poor labor markets, where it is 
hard to find jobs. In such plans, performance weights are based on many
background variables, such as prior work experience, the length of welfare 
dependency, education and number of children. 

Multiple regression formulas have advantages, but they can be complex.
They may also be more suited to analysis at the aggregate level than for 
the communication of program objectives to local staff or in setting 
performance criteria for service subcontractors. Moreover, the correct 
regression weights may not have been used in the past, many having been 
based on outcome levels rather than estimated impacts. 

This study presents some simpler weighting options, which take one, or
perhaps two, characteristics into account instead of many. Prior
employment is one important characteristic for WIN-mandatory AFDC women, as
this report will show. However, this study -- while searching for better
ways to judge program success -- makes no pretense of having all of the
answers, for the goals of some programs may not be easily translated into
simple weighting schemes.

2. Issues in Targeting

Much of the recent work in targeting welfare employment programs has 
focused on AFDC subgroups outside the WIN-mandatory category — such as 
mothers with young children, who are not part of this study. This ongoing 
research has tried to identify subgroups to whom services should be 
targeted because they are likely to have relatively long periods of welfare 
dependency.7 The basic premise is that the longer the predicted period of
dependency, the greater the potential reduction in dependency that program 
services can produce. A key assumption is that treatments can be found 
that would work effectively with the most dependent subgroups. 

These studies have successfully linked differences in length of
welfare dependency with measured recipient characteristics. One important
finding has been that the majority of people who enter the welfare system
spend less than four years on the rolls, even counting repeat spells.
Services targeted to this group, it is argued, may not be an efficient use
of resources. The smaller proportion of people who remain on welfare
account for the bulk of AFDC benefit expenditures, with one study
estimating that as much as 60 percent of all grant outlays are paid to only
25 percent of all recipients.8 Program assistance targeted to these
recipients, it is claimed, may substantially decrease the costs of
dependency, again assuming that effective services can be found for this
group.

A study by David Ellwood maintains it is possible to identify, on the
basis of demographic characteristics, the subgroups with a high risk of
extended dependency. His analysis identifies young, never-married women,
as well as women with young children, as candidates for targeting. As an
alternative, he would let the most dependent identify themselves -- that
is, those remaining on welfare after some specified period of time would
receive program services.9

The conditions and problems that lead to extended dependency, however,
may not be amenable to change with low-cost employability services. This
question cannot be resolved in this study. Further, the subgroups many
researchers identify as the portion of the AFDC caseload with the longest
expected dependency are not in the traditional WIN-mandatory category.
Thus, in this report, the "most dependent" subgroups do not, in fact,
include these cases. Eligibility in the three programs was broad. San
Diego worked with all mandatory applicants, Baltimore enrolled applicants
and newly-mandatory recipients, and Virginia served the entire mandatory
caseload. None, however, worked with AFDC recipients who had young
children. Moreover, dependency in this study was measured by dollars of
welfare received over a relatively short period: from one to at most three
years following program enrollment. In addition, the data on which this
analysis is based come from relatively low-cost programs that did not
provide, for the most part, intensive services. Most importantly, no
subgroups were singled out for special targeting attention.

Caution is urged in considering possible targeting options for this
WIN-mandatory population. Too narrowly defined targeting may destroy the
value of certain services. Working with only a small subgroup may reduce
overall effects on a caseload, even if subgroup effects are larger than
average. A closely related question is "tracking" versus "mainstreaming,"
an issue widely discussed in education. A tracking agenda puts high- and
low-achieving students into separate classes. Mainstreaming puts the two
together, with the idea that the brighter students can assist the others.
An open question in welfare employment programs is whether loosely
structured, low-cost services, such as job search workshops, can be
effective if women with no prior work experience do not have the
opportunity to learn from others who have held jobs. Prior job-holders, who
often find new jobs quickly, may, in addition, provide the necessary boost
for other participants to keep trying. "Tracking," or separating out
inexperienced workers, may also create staff problems if generally poor
success rates demoralize staff instructors.

C. The Demonstration of State Work/Welfare Initiatives

MDRC's Demonstration of State Work/Welfare Initiatives was launched in
1982 to test the effectiveness of state employment programs for people
applying for or receiving AFDC. For the most part, states were using their
new authority to experiment with WIN programs authorized by the Omnibus
Budget Reconciliation Act (OBRA) of 1981. The MDRC study includes programs
in 11 states, eight of which used random assignment to form experimental
and control groups for full-scale impact and benefit-cost studies. Most
programs have the goal of increasing employment and reducing the dependency
of the welfare population by preparing recipients for work. Thus, most
able-bodied recipients had to participate in job search and/or unpaid work
experience or other activities as a condition of welfare receipt.

The research was designed to assess three areas: the feasibility of
implementing a mandatory participation and/or work requirement; the
program's impacts on employment, earnings and welfare receipt; and the
cost-effectiveness of the different approaches. Findings from this MDRC
study are being released as the results for each state's program become
available. The programs in this study are examined in more detail in
Chapter 2.

In the three areas studied, the evaluations generally found that
employment and earnings improved, and, in two areas, there were welfare
savings. Also, the results for two of the programs (in San Diego and
Virginia) indicated the initial investment of funds in the programs would
result in government budget savings within a five-year time frame or less.

The subgroup impacts in these evaluations have suggested the
possibility of finding better methods to serve groups within the diverse
welfare population. For example, employment increases have generally been
larger for clients without a recent work history than for those who have
worked during the year prior to program enrollment. These findings are
buttressed by research conducted by MDRC in prior WIN programs and findings
from the National Supported Work Demonstration.10 This study is able to
examine a wider variety of subgroups than were analyzed in the final
reports, and uses longer-term data with a methodology more suited to the
questions of performance measures than was possible in the previous
evaluations.

CHAPTER 2

CHARACTERISTICS OF PROGRAMS AND PARTICIPANTS



This chapter discusses the similarities and differences between the
three state programs examined in this paper: the San Diego, Baltimore and
Virginia programs. The chapter then describes the characteristics of the
research samples as well as some of the normal behavioral differences among
welfare population subgroups in the absence of special services.

A. The Program Models 

No single program model was tested in MDRC's Work/Welfare study.
Rather, the participating states implemented their own initiatives, using
different strategies. Characteristics of the local WIN-mandatory
populations often differed as well.

The evaluations, on the other hand, are similar in methodology: each
study used an experimental design whereby program enrollees were randomly
assigned to one or more experimental groups or to control groups.
Experimental group members were subject to mandatory participation
requirements (e.g., they were required to take part in program services),
while the control groups were barred from the special programs, although in
some areas they could receive the minimal WIN services that were offered.
Data were collected on participation measures, outcomes in employment and
welfare receipt, and direct program operating costs. To estimate program
impacts, the employment and welfare behavior of the experimental and
control groups were compared over several quarters of follow-up. Because
randomization had produced experimental and control groups with similar
demographic characteristics and backgrounds in prior employment and welfare
dependency, any statistically significant differences in behavior could be
safely attributed to the programs' treatments.

In these studies, the term "applicant" identifies a person applying
for AFDC at the time of entry into the research sample, whether or not that
person's welfare grant was subsequently approved. That label remains, even
when the person becomes a recipient. The term "recipient" refers to a
sample member who was already receiving welfare at the date of sample
entry. These two subgroups are important and are analyzed separately
throughout much of this study. Other subgroup divisions are based on prior
demographic and background characteristics.

Table 2.1 shows the key characteristics of the programs involved in
this analysis. The published state reports contain more detail about both
the programs and the evaluation results. Briefly, job search and work
experience -- along with education and training in Baltimore, and, to a
lesser extent, in Virginia -- were the major program services, but states
differed in the mix and intensity of these services, their sequencing, and
the populations that received them. Programs were all mandatory, but
differed in the extent to which participation was enforced.

San Diego worked with all WIN-mandatory welfare applicants but did
not enroll recipients. Experimentals went through a two-stage fixed
sequence of group job search followed by a 13-week work obligation, if they
had not found unsubsidized jobs in the first phase. San Diego's decision
to focus on applicants rather than recipients represents one targeting
option available to program operators.

Baltimore, on the other hand, enrolled both WIN-mandatory applicants
and recipients, but only recipients who had just become mandatory, usually
because their youngest child had turned six years of age. In order to
ensure adequate funding on an individual basis for a somewhat broader array
of services, the Baltimore program restricted enrollment to only 1,000
registrants a year. The program provided a mix of components (including
job search, unpaid work experience, education and training), and staff made
service assignments according to enrollees' needs and preferences,
depending on their assessments and the availability of open slots.

Virginia enrolled a sample representative of its entire existing
WIN-mandatory caseload. The state stipulated that counties require job
search of all enrollees but authorized, as a county option, short-term work
experience, education and training as follow-up activities. Education and
training were not provided by the program; rather, participants were
referred to JTPA and community schools with independent funding, open to
all who qualified. Consequently, control group members participated in
education and training with a frequency equal to experimentals.

The treatments were relatively inexpensive, but did vary in average 
cost per experimental. For example, the cost of the San Diego program was 
about two-thirds that of the Baltimore program, which spent, on average, 
$1,050 per experimental. San Diego spent more on ensuring compliance with
its participation requirement (which entailed monitoring, registrant 
follow-up and limited sanctioning), while Baltimore offered more expensive 
services, such as education and training, and provided client stipends. 

The different funding levels and philosophies determined how mandatory
-- as measured by participation and sanctioning rates -- each program was.
In San Diego, fewer than one in 10 experimentals was not reached by the
program: that is, few people were still on welfare, not employed, and had
not participated in the program after nine months following program entry.
This high San Diego coverage indicates that a short-term participation
requirement was, in fact, realized by the program. In contrast, a larger
proportion of registrants -- almost one-quarter -- were not involved in
formal activities in the Baltimore program. Although this may be partly
due to the mix of recipients with applicants, it also reflects the
Baltimore staff's greater flexibility in deferring registrants. In
Virginia, most experimentals were reached by the program (nearly 90
percent), but the minimum requirement -- a loosely structured form of
independent job search -- was relatively easy for both the program and the
clients to fulfill.

Local economic conditions, staff experience and attitudes also
differed. Statutory grant maximums, based on state standards of need, also
varied widely, making cross-program comparisons problematic. Low benefit
levels increased the attractiveness of low-wage jobs in some areas, and
also increased the likelihood of a case closure when employment was
obtained. In San Diego, welfare recipients had a good market in which to
look for jobs, but in rural areas of Virginia, the prospects for employment
were limited. And, in the administrative reorganization permitted by OBRA,
social service staffs in some states -- who had recently assumed new
responsibility for employment functions -- had to go through a learning
process. However, staffs in San Diego and Baltimore had substantial prior
experience in operating employment and/or work programs, which contributed
to their programs' smooth administration.



Table 2.2 shows the size of the samples randomly assigned in the San
Diego, Baltimore and Virginia programs, while Table 2.3 describes sample
characteristics, broken down by the subgroups analyzed in this paper.
Differences in the program models, targeting philosophies, and the
environments in which the programs operated created variation. As noted
earlier, each program served the WIN-mandatory caseload (which excludes
women with children less than six years of age), or portions of that
caseload. The San Diego program served only applicants, while the
Baltimore and Virginia samples had a fairly even mix of applicants and
recipients, although the type of recipient differed.

The Baltimore and Virginia samples were similar in many respects: over
half had neither a high school diploma nor a GED; more than half had been
receiving AFDC for more than two years; and, on average, only about 40
percent had held a job in the year prior to random assignment. The San
Diego sample was less disadvantaged. More than half were high school
graduates; less than 30 percent had been on welfare for more than two
years; and one-half had held a job in the year before this welfare
application. Ethnic composition also differed. In Baltimore and Virginia,
between 60 and 70 percent of the samples were black; in San Diego only 20
percent of sample members were black, although Hispanics made up 18 percent
of the sample.

Comparisons of applicants and recipients in Baltimore and Virginia
reveal large differences in prior earnings and prior welfare receipt. In
fact, applicants in all three states were remarkably similar, as were
recipients in the two states that served them. Among applicants, somewhat
less than one-third were first-time applicants; another one-third fell into
the top earnings category ($3,000 or more for the pre-program year). In
both Baltimore and Virginia, three-quarters of the recipients had been on
welfare for more than two years.

C. Earnings and Welfare Receipt: The Normal Range

A wide range of earnings and welfare behavior of WIN-mandatory clients
in the absence of program intervention can be captured by simple objective
measures obtained at the time of random assignment. Figures 2.1 and 2.2
plot the earnings and welfare receipt of the early Baltimore control sample
by selected subgroups defined by applicant/recipient status, prior
employment and prior welfare receipt.

The subgroup differences in the Baltimore control sample were large. 
Quarterly average earnings for control group applicants without a prior
welfare history and with $3,000 or more in earnings in the year prior to 
AFDC application consistently fell into the $1,200 to $1,800 per-quarter 
range after the first year of follow-up (counting zero earnings for persons 
not employed). During the same period, subgroups with no recent employment 
history and a pattern of AFDC receipt for more than two years barely 
averaged earnings of from $200 to $400 per quarter. 

Welfare payments to control group members also differed, depending on
prior earnings and extent of previous welfare dependency. After three
years, longer-term recipients without pre-program earnings were receiving
from three to four times the quarterly benefit payments of first-time
applicants. Put another way, recipients for more than two years who had no
earnings in the pre-program year were only one-third of the sample, but
consumed nearly half of the total AFDC expenditures at the three-year
mark. A further breakdown of recipients by whether or not they had high
school diplomas revealed that dropouts who had not obtained even a GED were
18.0 percent of the sample but received 28 percent of the AFDC dollars. In
contrast, applicants -- about one-half of the sample -- were receiving less
than one-third of all welfare payments in Baltimore.

FIGURE 2.1
BALTIMORE CONTROLS, EARLY COHORT:
QUARTERLY AVERAGE EARNINGS, BY SUBGROUP

FIGURE 2.2
BALTIMORE CONTROLS, EARLY COHORT:
QUARTERLY AVERAGE AFDC PAYMENTS, BY SUBGROUP

These subgroups exhibit the full range of behavior connected with 
three major characteristics — applicant/recipient status, prior earnings 
and prior AFDC receipt. The analyses to date suggest that these charac- 
teristics are some of the best predictors of future earnings and welfare 
receipt for program eligibles, and impact differences may well be asso- 
ciated with these measures. Subgroups defined by these three dimensions 
will be the subgroups with priority in this investigation.

CHAPTER 3

METHODOLOGY



This chapter reviews the principal elements of the experimental
research design and the methodology used in this study. This chapter is
meant as a general guide, although some of the discussion is of a more
technical nature.

A. Experimental Design

Any valid analysis of program impacts is based on a fundamental
comparison between the observed outcomes of a program and what would have
occurred without it. As explained in Chapter 1, program outcomes are
relatively easy to observe. But the calculation of change -- or program
impact -- requires estimates of outcomes in the absence of the program.

A classical experimental design is the preferred way of obtaining the
standard for comparison. In such designs, clients are assigned on a random
basis to either program services, the experimental group, or to a control
group, which receives only the services available without the program. The
average outcomes of experimentals, minus the average outcomes of controls,
provide the program impact estimates, which show the program achievements
over and above the normal job-finding and welfare patterns of the eligible
population.

To maintain the integrity of the research design, no changes were made
in the research group designations after random assignment. "Experimentals"
remained experimentals and "controls" remained controls. In the
calculation of outcomes, experimentals who did not, for some reason,
participate in the programs were still counted as part of the experimental
group. Their actions could influence program impacts, which are expressed
on a per experimental rather than on a per participant basis.
Nonparticipants, for instance, could take jobs or leave welfare on their
own or be influenced by the program's participation requirement (since they
could be sanctioned if they refused to participate).
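
A minimal sketch of the per-experimental convention follows. The data and function names are hypothetical and serve only to show that nonparticipants stay in the experimental average; restricting the average to participants would compare a self-selected group with the full control group.

```python
from statistics import mean

def per_experimental_impact(experimentals, controls):
    """Average outcome of ALL experimentals minus average outcome of all controls.

    Each argument is a list of (quarterly_earnings, participated) tuples.
    The participation flag is deliberately ignored: group membership is
    fixed at random assignment.
    """
    return mean(e for e, _ in experimentals) - mean(c for c, _ in controls)

# Hypothetical sample: two experimentals never participated.
experimentals = [(400, True), (0, False), (200, True), (200, False)]
controls = [(100, False), (300, False), (0, False), (200, False)]

impact = per_experimental_impact(experimentals, controls)

# For contrast only: an average over participants alone is not an impact estimate.
participants_only = mean(e for e, took_part in experimentals if took_part)

print(impact, participants_only)
```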

The definition of subgroups follows this same labeling pattern.
Subgroups are defined by pre-existing characteristics at enrollment, not by
any subsequent behavior or activity.

B. Data Sources

Earnings and welfare data were assembled from administrative records. 
The use of such records offers several advantages. First, administrative
records can be much less expensive than survey data, in part because 
registrants do not have to be re-contacted during the follow-up. Records 
may also be more accurate than survey data because they do not depend on 
client recall of dollar amounts of earnings or welfare payments. Different 
rates of response by the experimental versus the control group — often a 
source of bias in survey data — are also not expected with records data. 

Administrative records are, however, limited in their comprehensiveness
and coverage. For example, quarterly earnings information can be
obtained from the Unemployment Insurance (UI) system, but data on wages and
hours worked are not available. Moreover, the information can only be
obtained with a lag, and some delinquency in filing earnings reports on the
part of employers is common in wage-reporting states. Another drawback is
that state UI systems do not normally record the earnings of people who
commute to work across state lines. Given random assignment, however, none
of these factors should affect experimental and control group outcomes
differently.

In addition, administrative records in this study contained no 
information on people other than the research sample members. They do not, 
for example, provide the earnings of other family members, whose income 
(both earned and unearned) will affect a household's welfare dependency and 
general well-being. 

The completeness and accuracy of the records data collected in this 
study were examined by comparing a small sample of data from the analysis 
tapes to the original paper or microfilm documents in state or county
offices. Earnings and welfare payments were well-matched. Further, a 
comparison of records and survey data from the Louisville WIN Laboratory 
and an earlier San Diego study suggests that the two sources yielded compar- 
able information, although administrative records showed larger total 
welfare receipt than the self-reports in interviews.

Records data were merged with demographic and program activity informa- 
tion to form a single program data base, with a new record compiled for 
each sample member. Each record contains the client's employment 
background and welfare history in addition to a series of outcome measures 
(quarterly UI earnings, monthly AFDC payments) running from the point of 
entry into the sample (i.e., the date of random assignment) through to the 
end of the follow-up. Program activities and dates are also included. The 
earlier a person entered the sample, the more follow-up data are available.
No sample member has fewer than four quarters of earnings data and months
of welfare data. The Baltimore program, with the longest -- and as yet
unpublished -- follow-up data, has the most complete information: an
additional year of earnings and welfare beyond the results reported in that
program's final report.
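
One way to picture the merged record described above is sketched below. The class and field names are assumptions introduced for illustration; the report does not document the actual file layout.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple

@dataclass
class SampleMemberRecord:
    """Hypothetical layout of one record in the merged program data base."""
    sample_id: str                      # hypothetical identifier
    random_assignment_date: date        # point of entry into the sample
    research_group: str                 # "experimental" or "control"
    # Employment background and welfare history at enrollment
    background: Dict[str, object] = field(default_factory=dict)
    # Outcome series: quarter of random assignment first
    ui_earnings_by_quarter: List[float] = field(default_factory=list)
    # Outcome series: month of enrollment first
    afdc_payments_by_month: List[float] = field(default_factory=list)
    # (activity name, date) pairs
    program_activities: List[Tuple[str, date]] = field(default_factory=list)

    def has_minimum_followup(self) -> bool:
        """The text states every member has at least four quarters of earnings data."""
        return len(self.ui_earnings_by_quarter) >= 4

record = SampleMemberRecord(
    sample_id="A001",
    random_assignment_date=date(1983, 1, 10),   # hypothetical date
    research_group="control",
    ui_earnings_by_quarter=[0.0, 0.0, 250.0, 400.0],
)
print(record.has_minimum_followup())
```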

The major data sources for all the programs analyzed are summarized
below:

• Client Information Sheets, one-page questionnaires filled out
by client and staff as part of the random assignment process,
provide information on the demographic characteristics of
sample members. All principal subgroups, with the exception
of the subgroups identified by prior earnings, were defined
using this information.

• State Unemployment Insurance (UI) Earnings Records provide
quarterly employment and earnings data reported by employers
for each calendar quarter: e.g., January, February and March;
April, May and June.

• AFDC records supply information on monthly AFDC (i.e.,
welfare) grants. Monthly AFDC data are grouped by three-month
periods, where the first month of the first quarter of
follow-up is the month of enrollment.

• Unemployment Insurance Benefit Records supply information on
monthly UI benefit payments.

• Program Activity records provide information on program
services, participation and deregistration.



Since random assignment can occur in the first, second, or third month
of a calendar quarter, the first quarter of UI earnings can contain
pre-program earnings for some sample members. The first quarter of earnings
is therefore not considered a clean follow-up quarter in the impact
analysis and is omitted from cumulative estimates of program impact.
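
The quarter-numbering rule for UI earnings can be sketched as follows. The function names and dates are hypothetical; the only rules taken from the text are that quarter 1 is the calendar quarter containing random assignment and that quarter 1 is left out of cumulative totals.

```python
from datetime import date

def calendar_quarter_index(d):
    """Running index of the calendar quarter containing date d."""
    return d.year * 4 + (d.month - 1) // 3

def followup_quarter(assigned, year, quarter):
    """Follow-up quarter number of calendar quarter (year, quarter).

    Quarter 1 is the calendar quarter in which random assignment occurred.
    """
    return (year * 4 + (quarter - 1)) - calendar_quarter_index(assigned) + 1

def cumulative_earnings(earnings_by_followup_quarter):
    """Sum follow-up earnings, skipping quarter 1, which may contain
    earnings from before random assignment."""
    return sum(v for q, v in earnings_by_followup_quarter.items() if q >= 2)

# Hypothetical member assigned in February 1983 (middle of a calendar quarter).
assigned = date(1983, 2, 15)
print(followup_quarter(assigned, 1983, 1))   # quarter of random assignment
print(cumulative_earnings({1: 500, 2: 300, 3: 0, 4: 700}))
```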



C. Choice of Follow-up Period

MDRC's research to date has shown certain patterns of outcomes for
experimentals and controls over time. Typically, the outcomes for
experimentals and controls were similar in the quarter of random assignment
but began to differ in quarter 2. (However, many experimentals did not join
activities for as long as six months after enrollment.) The
experimental-control differences grew slowly, with the difference often
peaking at the one-year point or beyond.

This paper divides follow-up into an immediate post-random assignment
period (quarters 1 through 3) and a longer-term follow-up period (quarters
4 and following). Quarters were averaged, which helps to eliminate some
of the transitory quarter-to-quarter variation in earnings. Earnings, as
well as employment, welfare incidence and AFDC payments, are expressed as
quarterly averages per person. Averages for the immediate and longer-term
outcomes were calculated separately. It should be emphasized that the
longer-term average contains more quarters of data for persons who entered
the research early. This averaging procedure has the disadvantage that it
does not explicitly estimate quarter-by-quarter time trends in impacts.
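
The averaging procedure can be illustrated with a short sketch. The function name and earnings figures are hypothetical; the split at quarter 3 and the per-person quarterly averaging follow the description above.

```python
def period_averages(quarterly):
    """Return (immediate, longer_term) quarterly averages for one person.

    quarterly lists an outcome for follow-up quarters 1, 2, 3, ... with zeros
    for quarters without earnings. Quarters 1 through 3 form the immediate
    period; quarter 4 onward forms the longer-term period. At least four
    quarters are assumed, as the text states for every sample member.
    """
    immediate = quarterly[:3]
    longer_term = quarterly[3:]
    return sum(immediate) / len(immediate), sum(longer_term) / len(longer_term)

# Hypothetical early entrant with six quarters of data.
early = period_averages([0, 300, 600, 900, 1200, 600])
# Hypothetical late entrant with the minimum four quarters.
late = period_averages([0, 0, 0, 400])
print(early, late)
```

The early entrant's longer-term average rests on three quarters and the late entrant's on one, which is the unevenness the text cautions about.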

The longer-term follow-up period was selected as the focus of this 
subgroup analysis because it best represents both post-program outcomes and 
impacts. Subgroup differences appearing in the later quarters are the best 
indicators of long-run effects and are therefore likely to be more indicative
of the total impact differences among subgroups. The training
activities and education programs in Baltimore, which run in duration for 
as much as one year, require a long follow-up period, with an emphasis on 
later periods. Unfortunately, the follow-up in Virginia was only four
quarters for the substantial portion of the sample.

Statistical tests were conducted and are reported for differences
between experimentals and controls within subgroups. While the differences
between impacts for pairs of subgroups were also tested, they were not as
frequently statistically significant. The results of such tests are
omitted from the tables but are occasionally mentioned as appropriate.

D. The Subgroup Impact Regression Model 

A simple difference between average outcomes for experimental and
control groups is sufficient to express reliable impacts in a carefully
implemented experimental design. The use of linear regression may, however,
lend extra precision to the estimates and correct for minor differences in
pre-program characteristics between experimentals and controls. For this
reason, the estimates reported in this paper are regression-adjusted.
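
A minimal sketch of regression adjustment, under stated assumptions, appears below. It regresses an outcome on a constant, an experimental-group indicator and a single pre-program covariate; the coefficient on the indicator is the adjusted impact. The data are hypothetical and deliberately exaggerate the experimental-control imbalance, and the small solver is a stand-in for whatever statistical software the study used, which the text does not name.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (small problems only)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Hypothetical data: outcome = 100 + 50 * experimental + 2 * prior earnings.
experimental = [1, 1, 1, 0, 0, 0]
prior_earnings = [10, 20, 30, 20, 30, 40]
outcome = [100 + 50 * t + 2 * x for t, x in zip(experimental, prior_earnings)]

rows = [[1, t, x] for t, x in zip(experimental, prior_earnings)]
intercept, adjusted_impact, covariate_slope = ols(rows, outcome)

# Unadjusted difference in means, for comparison
exp_mean = sum(o for o, t in zip(outcome, experimental) if t) / 3
ctl_mean = sum(o for o, t in zip(outcome, experimental) if not t) / 3
raw_difference = exp_mean - ctl_mean

print(raw_difference, adjusted_impact)
```

In this constructed case the raw difference understates the impact because controls happen to have higher prior earnings; with a properly randomized sample the two figures would normally be close.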

In addition, regression techniques have been used to produce two sets
of subgroup impacts. The first set takes the point of view of the program
administrator who asks: "Can I improve efficiency by targeting services to
registrants with a single subgroup characteristic?" For example, it may be
useful to find out if sample members with a high school diploma have
different impacts than those without diplomas, ignoring differences in any
other demographic characteristics. These impact estimates are unconditional
estimates, and this type of estimate is the focus of Chapter 4. Such
subgroup estimates do not take into account impact differences associated
with other demographic and background characteristics. For example, women
without a high school diploma generally have a weaker work record, but
unconditional estimates do not explain what part of the diploma effect is
due to the work history characteristic. Regression, in this case, serves
only the purpose of increasing precision and adjusting for minor
pre-existing experimental-control differences.

Two or more characteristics can be included in unconditional 
estimation as interactions, and these are often useful to program operators. 
To continue the example above, the sample may be split four ways: persons 
with and without a diploma, further divided by employed/not employed in the 
recent pre-program period. Impacts calculated for each of these four 
subgroups may answer the question as to whether it is worthwhile to target 
services to a narrow subgroup defined by diploma and prior employment 
status. This approach provides information about targeting on the basis of 
two subgroup characteristics, without controlling for other factors. 
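
The four-way split described above can be sketched as follows. This is a simplified illustration with invented records and hypothetical field names (`group`, `diploma`, `employed`, `earnings`). It uses raw experimental-minus-control differences within each cell, whereas the report's estimates are regression-adjusted.

```python
from collections import defaultdict

def unconditional_impacts(records):
    """Experimental-minus-control mean earnings within each subgroup cell.

    Cells are keyed by (has diploma, employed before the program).
    Assumes every cell contains at least one experimental ("E") and one
    control ("C") record; otherwise the division below would fail.
    """
    totals = defaultdict(lambda: {"E": [0.0, 0], "C": [0.0, 0]})
    for rec in records:
        cell = totals[(rec["diploma"], rec["employed"])][rec["group"]]
        cell[0] += rec["earnings"]  # running sum of earnings
        cell[1] += 1                # running count of persons
    return {
        key: groups["E"][0] / groups["E"][1] - groups["C"][0] / groups["C"][1]
        for key, groups in totals.items()
    }

def make(group, diploma, employed, earnings):
    return {"group": group, "diploma": diploma,
            "employed": employed, "earnings": earnings}

# Invented quarterly earnings, two persons per research group per cell.
records = [
    make("E", True, True, 900), make("E", True, True, 940),
    make("C", True, True, 880), make("C", True, True, 920),
    make("E", True, False, 500), make("E", True, False, 560),
    make("C", True, False, 400), make("C", True, False, 460),
    make("E", False, True, 700), make("E", False, True, 760),
    make("C", False, True, 650), make("C", False, True, 710),
    make("E", False, False, 300), make("E", False, False, 360),
    make("C", False, False, 150), make("C", False, False, 210),
]

impacts = unconditional_impacts(records)
for (diploma, employed), impact in sorted(impacts.items()):
    print(f"diploma={diploma!s:5} employed={employed!s:5} impact={impact:6.1f}")
```

Each cell's estimate answers only the targeting question for that cell. It says nothing about which of the two characteristics is responsible for the difference.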

Regression analysis can lead to another set of estimates, conditional 
estimates, that may reveal the associations of underlying factors. 
Conditional estimates hold all subgroup characteristics constant except the 
one in question. That is, any conditional impact difference associated 
with a high school diploma would indicate the importance of the schooling 
credential itself, eliminating effects due to prior employment record and 
other characteristics. If conditioning on prior employment status 
nullified the diploma effect, then the prior-employment difference across 
diploma subgroups may be considered the "real" reason for the diploma 
impact. 

Both unconditional and conditional estimates are important, depending 
on the questions asked. Unconditional estimates are presented and 
discussed in the next chapter because they address questions of targeting 
with limited information. Conditional estimates, however, are required for 
the testing of performance measures in Chapter 5. They will be discussed in 
Chapter 4 only insofar as they raise issues regarding the conclusions of 

A handful of prior studies have attempted to test the correlation 
between various measures of performance and net program impact. These 
studies did not have experimental comparison data, but their techniques are 
similar to the ones used in this study of performance measures. 

The basic approach is as follows: 

1. Obtain an estimate of net program impact for each individual 
in the treatment group; 

2. Create a measure of program performance (e.g., did the 
sample member enter employment, what were his/her wages?); 

3. Compute correlation coefficients between the net impact and 
the performance measures, with the measures having the greatest 
correlation being identified as the "best" performance 
indicators; 

4. As a supplemental analysis, determine whether two indicators 
work better than one. Compute a regression of net impact on 
two performance indicators and report the coefficients and 
their statistical significance. In this way, it may be 
possible to determine that one indicator has more power than 
another or is a useful supplement. 

This procedure has remained approximately the same since studies correlated 
performance measures with the impacts of certain pre-CETA employment 
programs. 
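
The correlation step of this procedure can be sketched as follows. The individual impact values and the two indicators (`job_entry`, `off_welfare`) are invented and carry no information about the programs studied. The helper `pearson` is a hypothetical name for the standard Pearson correlation coefficient. The supplemental regression on two indicators is not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Step 1 (assumed already done): an estimated net impact per individual.
net_impact = [120, -30, 200, 10, 90, -50]

# Step 2: candidate performance measures, coded 1 = yes, 0 = no.
indicators = {
    "job_entry":   [1, 0, 1, 0, 1, 0],
    "off_welfare": [1, 1, 0, 0, 1, 0],
}

# Step 3: correlate each measure with net impact and rank them.
correlations = {name: pearson(values, net_impact)
                for name, values in indicators.items()}
best = max(correlations, key=correlations.get)

for name, r in correlations.items():
    print(f"{name:12s} r = {r:+.2f}")
print("highest correlation:", best)
```

The ranking is only as good as the individual impact estimates fed into it, which is the difficulty taken up next.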

The difficult part of this process is the first step: the estimation 
of a net impact for each individual. Prior studies estimated 
individual-level impacts without experimental data, and thus have had to 
depend on impact estimates from participant/nonparticipant comparisons 
adjusted by regression for various demographic and participation variables, 
such as type of treatment and length of stay. Thus, while these studies 
have used essentially the same procedure to estimate individual impacts, 
the estimates they have generated may be biased insofar as the regression 
models used were not able to control for all observable and unobservable 
differences between the participant and nonparticipant groups. 



CHAPTER 4 
SUBGROUP DIFFERENCES IN IMPACTS 



This chapter summarizes the findings of an analysis of program impact 
differences for subgroups of the WIN-mandatory AFDC caseload in San Diego, 
Baltimore and Virginia. Using the data and statistical methods described 
in the last two chapters, the analysis develops estimates of employment and 
welfare impact differences and then explores some implications of those 
differences. Subgroup differences in program costs are also briefly 
considered. Additional results on the benefit-cost implications of the 
subgroup differences are available from MDRC. 

The thrust of these findings is that when people were defined in terms 
of their prior work and welfare history, the least dependent WIN-mandatory 
applicants and recipients generally experienced below-average program 
impacts and often the smallest impacts overall. These findings suggest 
that a policy of targeting programs only to those in the WIN-mandatory 
caseload who are most "job-ready" would not be efficient. Impacts were 
much larger for subgroups who were less job-ready. There is some evidence, 
however, that focusing narrowly on only the groups who normally receive the 
most welfare may not be desirable, at least with the relatively short-term, 
less intensive services offered by these three programs. As the 
chapter will delineate, these broad conclusions are drawn from a complex 
series of results and are subject to a number of important qualifications. 



Before the results for each of the five experimental samples (San 
Diego applicants, Baltimore applicants, Baltimore recipients, Virginia 
applicants, and Virginia recipients) are presented, the overall impact 
differences are examined collectively across the different programs and 
AFDC groups. (See Table 4.1.) The five sets of employment and welfare 
impacts have been combined into a single set of composite estimates as a 
summary device. These estimates do not indicate the variation by program, 
but these differences are addressed later. 

The impact estimates were calculated using data collected in the 
fourth quarter after random assignment and in all subsequent quarters 
through to the end of the follow-up periods. By the fourth quarter, most 
members of the experimental groups who were participating in the programs 
had already finished the activities or were no longer subject to the 
participation requirements. Thus, the impact estimates generally reflect 
the post-program experience of the samples, and are probably the best 
available indicators of the longer-term effects of these programs on the 
experimental groups. 

1. Full Sample 

As the composite estimates show, the average quarterly employment rate 
in this period was 38 percent for the experimental group compared to 34 
percent for controls. The impact of 4 percentage points is statistically 
significant. Similarly, experimentals who worked earned an average of 
$1,679 per quarter which, taking into account individuals who did not work, 
came to $638 per quarter for all experimentals, a figure that is $87 
higher than the average earnings level of controls, for a 16 percent 
improvement. 
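
The arithmetic linking these figures can be checked with a short sketch. The variable names are illustrative. The control-group mean is not stated directly in the text and is inferred here as the experimental mean minus the $87 impact.

```python
# Figures quoted in the text for the full-sample composite estimates.
employment_rate = 0.38        # average quarterly employment rate, experimentals
earnings_if_employed = 1679   # average quarterly earnings of those who worked
earnings_impact = 87          # experimental-control difference, dollars per quarter

# Average over all experimentals, counting non-workers as zero earnings.
earnings_all = employment_rate * earnings_if_employed

# Inferred control-group mean (an assumption: experimental mean minus impact).
control_earnings = round(earnings_all) - earnings_impact

# Impact expressed as a percentage of the control-group mean.
percent_improvement = 100 * earnings_impact / control_earnings

print(f"earnings per experimental: ${earnings_all:.0f}")
print(f"implied control mean:      ${control_earnings}")
print(f"percent improvement:       {percent_improvement:.0f}%")
```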

A clear subgroup pattern underlies these employment impacts for the 
full sample. Employment and earnings impacts for sample members with no 
earnings in the year prior to random assignment were larger than the 
overall average employment and earnings impacts, while the gains for the 
subgroups who did have prior earnings were smaller and usually not 
significant. The programs raised the employment level of the less 
employable group by 5 percentage points compared to only 3 percentage 
points for the other two groups with prior earnings. This is not because 
members of the experimental group with better earnings records were less 
able to find employment than those with poorer records. In fact, they 
entered employment more frequently, as one would expect. Almost 50 percent 
of the experimentals with $1-2,999 in prior earnings, and 62 percent of 
those with even more, were employed during the follow-up period, a much 
higher level than the 26 percent level achieved by experimentals with no 
previous earnings. But their employment was less of a gain over that of 
their control group counterparts. 

Similarly, sample members who had been on welfare (i.e., had their own 
AFDC case) in the past showed significant gains in both employment and 
earnings, while those with less dependency experienced virtually no change. 
Controls with prior welfare had a lower employment rate on their own than 
those who had not been on welfare, but the more welfare-dependent 
experimentals made the greatest gains. Thus, using the simple measures of prior 
work and welfare experience to categorize individuals, the less employable 
and more dependent subgroups had the largest employment impacts. 

A similar, although somewhat weaker, pattern can be seen in the 
composite welfare impacts. Overall, the average proportion of individuals 
in the experimental group who received welfare each month was 1.3 
percentage points lower than for controls; they also received $16 less in 
AFDC payments per quarter, with the latter impact statistically significant. 
The subgroup with no prior earnings had relatively high impacts. 

2. Applicants and Recipients 

Tables 4.2 and 4.3 break down these composite estimates for AFDC 
applicants and recipients. The applicant impacts are larger than those for 
recipients despite the fact that their participation rate in program 
services was somewhat lower. In fact, quarterly applicant employment 
impacts were about double those of recipients. A supplemental analysis 
(in which applicant and recipient data were pooled and the impacts estimated 
separately for the two groups with demographic differences controlled) 
indicated that the impact differences stemmed from the applicant/recipient 
distinction, and not from other factors. 

The separate subgroup results for applicants and recipients show a 
similar pattern to the overall estimates. Applicant impacts (shown in 
Table 4.2) indicate that the employment and earnings gains, as well as 
welfare savings, were lowest for first-time applicants, that is, those who 
reported never having had their own AFDC case. Some of these individuals 
may have received welfare on their mother's grant as a minor, but as a 
group they are clearly the least welfare-dependent. Impacts were also 
lowest for applicants with the best prior earnings records, with applicants 
in the two lower-earnings categories having significant impacts of similar 
magnitude. Similarly, employment and welfare impacts were largest (and 
statistically significant) for applicants who had been on welfare before, 
although the length of time on welfare does not seem to have mattered much. 

Among recipients, very few people had high prior earnings or had never 
had their own AFDC case, making the corresponding subgroup impact estimates 
too imprecise to be reliable; results for these groups have thus been 
dropped from the table. The great majority of recipients fell into the "no 
prior earnings" and "more than two years of welfare" categories. 
Recipients without earnings in the previous year had statistically 
significant employment and earnings gains, although the gains were smaller 
than for applicants in the same subgroup. On the other hand, recipients 
with modest prior earnings had no gains at all; in fact, they registered an 
earnings loss. On the welfare side, savings were small, although subgroup 
differences were in the same direction as the earnings gains. 

The composite estimates thus provide dichotomous evidence on program 
effectiveness. For applicants, program services were generally effective 
for everyone except the most employable subgroups, for whom the services 
made very little difference. Interestingly, the impacts on the moderately 
and very dependent subgroups were about the same. However, program 
services were generally less effective for recipients, who typically 
include the most dependent individuals of all. 

The impacts associated with characteristics other than prior earnings and 
welfare receipt, such as education and marital status, were less consistent 
across program samples. These characteristics sometimes appeared important, 
however, in determining the different subgroup impacts of different programs. 

B. Estimates for the Five Samples 

In most cases, the patterns noted in the composite estimates hold up 
across the five samples in the analysis, although there were some 
inconsistencies across program samples. There may be some important 
interactions between subgroup characteristics, program features and economic 
conditions that determine the local patterns. While it is too early in the 
research to analyze such interactions, the available evidence from these 
samples does suggest some promising directions for future research. 

The following three tables present subgroup impact estimates for each 
of the five samples. Tables 4.4 and 4.5 show impacts for the same two sets 
of subgroups considered in the composite estimates, while Table 4.6 
considers impacts on earnings and welfare payments associated with other 
subgroup characteristics. As with the composite results, the impact 
estimates start at the fourth quarter after random assignment and go 
through to the end of the observation period; estimates for the first three 
quarters, as well as for the follow-up period as a whole, were also 
calculated and are available from MDRC. The short-term results are 
generally consistent with the longer-term impacts, although their magnitude 
varied in some program settings. 

1. San Diego 

As Chapter 2 indicated, the welfare employment program in San Diego 
differed from the others in two important respects. First, the program 
served only welfare applicants. Second, all enrollees had the same 
short-term sequence of program activities: job search followed by work 
experience. Moreover, participation rates were high for all subgroups. 

The San Diego findings clearly indicate that the program had its 
greatest impacts on the less job-ready and more welfare-dependent 
applicants. Those with the lowest prior earnings (zero dollars for the 
year prior) had by far the largest earnings impacts, while welfare savings 
were spread evenly over the two lower-earnings subgroups. Similarly, 
applicants with a welfare history had most of the earnings gains and 
welfare savings, although both impacts were somewhat greater for the group 
with a briefer welfare stay. 

Some characteristics associated with dependency and employability 
other than prior earnings and welfare history appear to be positively 
related to the program's impacts in San Diego. The results for subgroups 
presented in Table 4.6 suggest, for example, that race and the number of 
children in a household were important in this sample. These are clearly 
aspects of dependency, given the low earnings and the high welfare payments 
made to control group members who were non-white and had more than one 
child. Some of the other results are not consistent, notably the greater 
impacts for applicants who had a high school diploma or GED, a factor not 
usually related to long dependency. However, this may be due to the nature 
of the San Diego program. Prior education may have increased the 
probability of success in a program that (unlike Baltimore and Virginia) 
did not offer remedial education. 

The subgroup results clearly indicate that the San Diego program had 
greater impacts on its most dependent applicants. Dependency, defined here 
as "having high welfare payments and low earnings," can be viewed as 
falling along a spectrum, ranging from the most to the least dependent 
cases, and as involving many characteristics, rather than just two. To 
assess the relationship between dependency, viewed this way, and program 
impacts, a "dependency score" was assigned to each person in the San Diego 
sample on the basis of a number of pre-program characteristics. 



In Figure 4.1, earnings impacts estimated for individuals were plotted 
against their dependency scores. The bottom graph for the San Diego sample 
depicts the impact "responsiveness" of individuals with different levels of 
dependency. While the figure suggests that the San Diego program model was 
somewhat less effective with applicants at the two ends of the dependency 
spectrum, it also indicates that this or a similar program model was 
effective for a broad range of AFDC applicants. Even relatively dependent 
welfare applicants benefited somewhat from the treatment, which suggests 
that they should be included in such programs. And, while short-term job 
search and work experience may not always be helpful to relatively job- 
ready applicants, it was on average beneficial in San Diego. 

The figure also shows a possible threshold effect: at some level of 
self-sufficiency and job-readiness, program impacts increased and, as seen 
in this graph, then began to decline again — this time, for the least 
dependent in the sample. The Baltimore program, described below, is better 
suited to a discussion of this potential effect, since it enrolled a broad 
range of both applicants and recipients. 

2. Baltimore 

The Options program in Baltimore was very different from the San Diego 
initiative. Newly-mandatory AFDC recipients were enrolled as well as 
mandatory applicants. In addition, there was a greater range of services, 
from independent job search to education and training, and the services 
could vary according to the registrants' needs and preferences. Participant 
choice, however, was constrained by staff appraisal and slot availability. 
Because the least job-ready generally participated at higher levels in the 
more intensive services (work experience, education and training) than 
other cases, the subgroup impacts may have been influenced by the different 
services participants received, as well as their own characteristics. 

The Baltimore results in this paper are based on an extra year of data 
collection, so the findings are somewhat different from those presented in 
the final report on the Options program. In the final report, it was 
suggested that the program treatment, which could include services that 
lasted for a year or longer, might lead to greater impacts at a later 
point. The additional data support that speculation. Overall earnings 
impacts continued to increase. Experimental-control differences in 
earnings for applicants were, in fact, more than 50 percent larger, and the 
earnings of recipients increased as well. Welfare savings, which earlier 
were not significant, did not change. 

Two principal findings are apparent in the Baltimore subgroup impact 
results in Tables 4.4 and 4.5. One is the small earnings gains for 
recipients compared to applicants. As a whole, applicants in the program 
earned $172 more per quarter than controls, a statistically significant 
increase of 21 percent that is comparable to the change for applicants in 
San Diego. However, recipients, who were more welfare-dependent than 
applicants, earned only $37 more. The difference between applicants and 
recipients is statistically significant. These findings are especially 
important since recipients were not under-served, and the follow-up period 
was long enough to capture short-term program effects from education and 
training. 

The pattern of subgroup results for applicants shows that those with 
the highest pre-program earnings or without a welfare history had the 
smallest gains. The remaining large share of applicants did experience 
statistically significant earnings impacts. Among recipients, only those 
without pre-program employment experienced a statistically significant 
earnings increase. No significant welfare savings were found in this 
longer follow-up period for either applicants or recipients or for any 
other subgroup. 

Table 4.6 shows that Baltimore differed from San Diego in other 
subgroup earnings impacts. Applicants without a high school diploma or GED 
had larger impacts, perhaps reflecting the remedial education services 
offered by the Options program. Younger women and women with younger 
children also experienced somewhat larger-than-average gains. These 
factors operated differently in San Diego. 

The top graph in Figure 4.1 plots the earnings impacts of Baltimore 
applicants and recipients against individual dependency scores in the 
manner described for San Diego's applicants. The Baltimore program is 
particularly appropriate for such an investigation, since a broad spectrum 
of people, from first-time applicants to long-term recipients, were 
enrolled. The recipients, although all newly-mandatory, had often been on 
welfare in WIN-exempt status for some time. The combination of mandatory 
applicants and newly-mandatory recipients might be typical of an incoming 
group in a steady-state mandatory service program. However, the Baltimore 
program was limited to 1,000 slots to ensure adequate resources to serve 
the full range of enrollees. 

The Baltimore graph, even more than in San Diego, lends substance to 
the threshold idea. It suggests that earnings impacts were largest for 
individuals in the middle-dependency range, peaking at a point somewhat 
above the median. At the very left of the dependency spectrum, very 
dependent cases did not appear to respond to the services as well as other 
cases. Beyond some threshold level of self-sufficiency and job-readiness, 
program impacts increased. But at the other end of the spectrum, the 
program again had less effect, this time on the people who were relatively 
well-prepared for jobs. These job-ready people seemed more able to 
enter employment and leave welfare without program help. 

Two factors are important to note. The level of dependency was more 
extreme in the Baltimore sample because San Diego did not serve recipients. 
And the shape of the curves for both programs would presumably change if 
the models or eligible populations changed. 

3. Virginia 

Virginia extended program participation requirements to the whole 
WIN-mandatory caseload of recipients as well as mandatory AFDC applicants. 
It also served rural as well as urban areas, and counties had considerable 
independence in implementing the program. Resource constraints, however, 
were important: the program relied on job search assistance as its 
principal component and on independent job search as the most widely-used 
kind of job search. Community providers, such as schools and JTPA training 
programs, which received no program funding, provided the education and 
training. Because controls obtained these education and training services 
on their own about as frequently as the experimentals, the Virginia 
impacts can only be attributed to job search and work experience. 

Virginia has the shortest follow-up of the three programs: only 4 to 
6 quarters, depending on when an individual entered the sample. The 
impact estimates are therefore preliminary, and should be interpreted with 
more caution than the others. 

Statistically significant employment and earnings impacts were found 
only for applicants, not for recipients. As in Baltimore, the 
participation levels of recipients equalled or exceeded those of applicants, 
but recipient earnings impacts were about [illegible] those of applicants 
and not statistically significant. Welfare reductions were not statistically 
significant for either applicants or recipients. 

Within the applicant group, sample members without recent pre-program 
employment experienced statistically significant increases in employment 
and reductions in the proportion on welfare. The highest prior-earnings 
group improved the least in both employment and welfare. However, the 
middle group (based on prior earnings) recorded the largest employment 
impacts. Applicants with no prior welfare had the smallest employment and 
welfare impacts. Earnings gains did not uniformly follow employment gains 
and, in fact, the impacts did not fall into the usual patterns seen for 
other high and low prior-earnings subgroups. These inconsistencies may 
stem in part from the limited follow-up data. 

For recipients, the larger impacts were recorded for individuals with 
more welfare experience, although the prior-earnings categories did not 
exhibit much difference. 

Results for the other subgroup categories in Virginia, which are 
presented in Table 4.6, show impact differentials among applicants of 
comparable magnitude to the differentials in the other states. The direction 
of these differentials, however, is not always the same. These variations in 
patterns across states suggest that different dimensions of dependency and 
employability may dominate in particular program settings, reflecting 
differences in programs, caseloads and labor markets. 

C. Subgroup Combinations 

One of the implications of the preceding analysis is that, not 
surprisingly, specific subgroup characteristics may differ in importance in 
different program settings. This implies that different dependency and/or 
employability criteria may be critical for a given program model or for a 
welfare caseload in certain locations. As a result, it is possible that 
subgroups defined in terms of many characteristics rather than just one may 
predict impact differences more consistently. 

The combination of weak prior earnings with longer welfare history was 
used to define a more dependent portion of the sample. Table 4.7 presents 
impact results for four pairs of such subgroups. One pair shows applicants 
with prior earnings in the two lowest categories plus a welfare history 
versus those with either relatively high prior earnings or no welfare 
history. A similar split is made for recipients: no prior earnings and 
more than two years on welfare in one group against all other recipients in 
the second group. Two additional pairs were created by adding, for the 
two more dependent groups, the factor that group members did not have a 
high school diploma. 

The results suggest that subgroups defined by combined work and 
welfare criteria may be more consistent predictors of impacts (at least 
of earnings impacts) than either characteristic alone. In all three 
applicant cases, the low-work, high-welfare subgroups experienced larger 
earnings impacts, although the differences for recipients were minimal. 
The addition of "lacks diploma" tended to reduce these differences. 

In developing welfare employment policy, program impacts on 
employment, earnings, welfare receipt and other outcomes must be weighed 
against program costs. This section briefly describes the cost differences 
by subgroup in the San Diego and Baltimore programs and discusses the 
implications of these differences for the results of the overall analysis. 
A more detailed discussion of costs, together with an assessment of the 
benefit-cost implications of the subgroup impact and cost differences, is 
available from MDRC. 

Table 4.8 presents total program costs, expressed on a per-experimental 
basis, for the San Diego and Baltimore programs. The figures include 
the costs of serving nonparticipants as well as participants in the 
experimental groups, and are broken down by major program component. They 
are also disaggregated for the two major subgroups based on prior earnings 
and welfare experience. 

Overall, subgroup variation in cost was small compared to the 
variation in impacts, particularly in San Diego, which has the same 
treatment sequence for all enrollees. Also, because that program was not 
long and education and training were not included in the sequence, costs 
were not large. The major components were group job search, work 
experience, assessment and support services. In Baltimore, where total 
costs were higher, relatively expensive services, such as education and 
training, were usually assigned to the less job-ready registrants. Thus, 
subgroup costs did vary somewhat, but the costs of services were closely 
related to specific groups: for example, individuals without a high school 
diploma received more costly remedial education services than enrollees who 
already had a diploma. 

Prior AFDC receipt was the most important characteristic associated 
with higher costs in both programs. The group with the longest stay on 
welfare (more than two years) had the highest costs. People in this 
subgroup stayed in the programs longer and, in Baltimore, were assigned to 
the expensive services more often. 

The limited cost differences support the conclusions already reached. 
First, serving the less job-ready and the more welfare-dependent is 
cost-effective: while it costs slightly more to work with people who have 
been on welfare for a longer time, the net impacts on AFDC and employment 
are substantially larger than for the less dependent cases. Second, the less 
job-ready and the more dependent welfare applicants and recipients gain the 
most financially from these programs. Their earnings gains were generally 
more than enough to offset their reduced welfare benefits. 



CHAPTER 5 
MEASURES OF PROGRAM PERFORMANCE 



As described in the first chapter, the best measure of performance in 
a welfare employment program is its impact on the people it serves. But 
genuine impacts cannot be obtained simply or quickly enough to be used in 
the management of most programs. This chapter first assesses the value of 
two more practical measures of performance: "job entries" (including 
placements) and cases "off-welfare" (or welfare departures). It then 
discusses program participation and "coverage" indicators. 

A. Job-Entry and Off-Welfare Measures 

Program impacts require a comparison framework: ideally, an 
experimental research design and data collection over a long enough period to 
fully observe the course of changes generated by a program. Unfortunately, 
much as administrators might want this information, they must make 
decisions in a reasonable period of time based on data that are readily 
available. Thus, the performance of welfare employment programs is usually 
determined by counting the numbers of registrants who obtain jobs and/or 
leave welfare. However, these measures overstate program impacts because, 
as the experience of the control groups in this analysis has shown, many 
recipients find jobs and leave welfare in the absence of program 
assistance. This means that some programs' high rates of job entry may 
result from their having relatively "job-ready" caseloads or a strong labor 
market, while the apparently poor performance of other programs may stem 
from less advantageous conditions. 

More serious than simple overstatement is the fact that, while change 
takes place when a recipient finds work or leaves welfare because of the 
program, the degree of that change varies by type of individual. A job 
entry for a recipient who has not worked for several years implies more 
change than a job entry by a person who has recently worked. This type of 
change (or the degree of program success) cannot be seen in unadjusted 
outcome measures. For example, conscientious program administrators 
seeking high job-entry rates may focus staff time and resources on placing 
relatively job-ready registrants, many of whom might have been able to find 
jobs on their own. 

1. How Bad Are the Outcome Measures? 

Using estimates of program impacts obtained on an individual basis is 
a logical way to assess the extent of the problem with outcome measures: 
the poorer the correlation between the performance measures and the 
impacts, the more serious the problem. Consequently, short-term job entry 
and off-welfare measures were examined in relation to program impacts on 
earnings and welfare payments in the San Diego and Baltimore programs. 

A short-term job entry was defined as "employed at some point during 
quarters 2 or 3 after random assignment," and off-welfare was defined as 
"receiving no welfare payments in the third quarter." Somewhat longer-term 
measures took into account quarter 4 and the following ones for employment; 
quarter 6 was the point-in-time for welfare payments. In this report, the 
job entry data were UI earnings, which are more accurate and complete than 
typical program placement data,1 but are less accessible to program 
operators. 
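
The indicator definitions above can be written out compactly. The following 
is a minimal sketch, assuming each person's record is a mapping from quarter 
number (counted from random assignment) to dollars of UI earnings or AFDC 
payments. The function names, the record layout, and the treatment of a 
missing quarter as zero are illustrative assumptions, not taken from the 
study's data files. 

```python
def short_term_job_entry(ui_earnings_by_quarter):
    """Employed at some point during quarters 2 or 3 after random assignment."""
    # Assumption: a quarter absent from the record counts as zero earnings.
    return any(ui_earnings_by_quarter.get(q, 0) > 0 for q in (2, 3))


def longer_term_job_entry(ui_earnings_by_quarter, last_quarter):
    """Any UI earnings in quarter 4 through the last follow-up quarter."""
    return any(ui_earnings_by_quarter.get(q, 0) > 0
               for q in range(4, last_quarter + 1))


def off_welfare(afdc_payments_by_quarter, quarter=3):
    """No AFDC payment in the reference quarter (3 short-term, 6 longer-term)."""
    return afdc_payments_by_quarter.get(quarter, 0) == 0


# Hypothetical record: earnings begin in quarter 3, AFDC ends after quarter 4.
earnings = {1: 0, 2: 0, 3: 850, 4: 1200, 5: 1300, 6: 1250}
payments = {1: 900, 2: 900, 3: 600, 4: 300, 5: 0, 6: 0}
print(short_term_job_entry(earnings), off_welfare(payments),
      off_welfare(payments, quarter=6))
```

In this made-up record the person counts as a short-term job entry but is 
off-welfare only on the longer-term (quarter 6) version of the measure. 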



Tables 5.1 and 5.2 display, in summary form, the results of correlating 
the job entry and off-welfare outcomes with program earnings and 
welfare impacts estimated for each experimental group member on the basis 
of regression results from the previous chapter.2 The indicators are 
ranked in the table as "good" (positively correlated with impacts and 
statistically significant), "fair" (positively correlated but not 
statistically significant), "weak" (negatively correlated but not 
statistically significant) and "poor" (negatively correlated and 
statistically significant). Rankings are provided for all short-term 
versions of these indicators. If the longer-term version indicated 
substantial improvement, the higher rank is shown in brackets. 
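
The four-way ranking can be stated as a simple rule. The sketch below is 
illustrative only: the function name is hypothetical, the 0.05 significance 
threshold is an assumption (the text does not state the level used), and a 
correlation of exactly zero is grouped with the negative cases. 

```python
def rate_indicator(correlation, p_value, alpha=0.05):
    """Rank an indicator by the sign and significance of its correlation
    with estimated program impacts."""
    significant = p_value < alpha  # assumed threshold, not from the report
    if correlation > 0:
        return "good" if significant else "fair"
    return "poor" if significant else "weak"


# Hypothetical correlations between an indicator and earnings impacts.
for r, p in [(0.21, 0.01), (0.05, 0.40), (-0.04, 0.55), (-0.18, 0.02)]:
    print(r, p, rate_indicator(r, p))
```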

It is clear that job entries were not satisfactory performance 
measures for the San Diego or Baltimore programs. In all cases, short-term 
job entry was a weak or poor indicator of earnings impacts, and the 
longer-term version showed little improvement. This suggests that the simple 
job entry measure of performance is inadequate and may even encourage reduced 
program performance in terms of impacts. Job entries were also not a 
satisfactory indicator of welfare savings. 

The off-welfare measure also performed poorly. Most of the correlations 
with earnings gains and welfare savings were poor to fair. 
Interestingly, the off-welfare measure performed marginally better as an 
indicator of net welfare savings. One of the cases yielded a result of 
"good." 

These findings are consistent with those of the previous chapter, 
supporting the conclusion that performance standards based on job entry or 
off-welfare rates are unrelated to true program effectiveness. The 
findings should not be interpreted as suggesting it is wrong for programs 
to promote job entries or case closures. Rather, the results show that the 
outcomes of a program are closely tied to the characteristics of program 
registrants, and standards that ignore this fact provide a misleading 
picture of real program accomplishments. Such standards may also cause 
program funds to be poorly allocated. 

2. Can Better Measures Be Developed? 

Up to this point, a job entry has had equal value for all WIN-mandatory 
clients, regardless of their work and welfare histories. But the 
preceding chapter suggests a different scoring strategy: one that gives 
more weight to job entries of registrants with weaker previous work records 
or longer time on welfare. 

To explore this strategy, the job entry and off-welfare variables were 
calculated using a number of different weighting schemes indicating low 
earnings and high welfare dependency. Some of the indices were created on 
the basis of predicted levels of experimentals' earnings and welfare 
receipt. Others merely assigned extra points for job entries recorded for 
persons with low levels of pre-program earnings or with a long welfare 
history, using the definitions of the preceding chapters. The correlations 
improved in several cases, suggesting that the job entries of less 
employable welfare recipients should be given extra weight in setting 
performance standards. 

Some of the tested weighting schemes used complex, regression-based 
indices. These methods require a complete demographic profile by enrollee 
and proper weights for each characteristic. While this approach may be 
suitable for aggregate analyses, where proper weights can be calculated 
for local labor market conditions and AFDC statutory grant levels, it 
has drawbacks as a tool for local operators and caseworkers. The extra 
data collection is costly, and calculated scores for each enrollee would be 
subject to error. Perhaps most importantly, the complexity of the 
information may obscure the operational priorities line staff need. 

The alternative approach uses information about only the most 
important registrant characteristics, namely, prior employment and 
welfare experience. One such measure was created for job entries based 
only on prior employment:3 

    Not employed in year prior: 4 points per job entry 
    $1-2,999 earnings in year prior: 2 points per job entry 
    $3,000 or more earnings in year prior: 1 point per job entry 
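
The point schedule above translates directly into a scoring rule. The 
sketch below is a minimal illustration; the function names and the sample 
caseload are hypothetical, and any positive prior-year earnings below 
$3,000 are assumed to fall in the middle category. 

```python
def job_entry_points(prior_year_earnings):
    """Points awarded for one job entry, by earnings in the year prior."""
    if prior_year_earnings <= 0:
        return 4  # not employed in year prior
    if prior_year_earnings < 3000:
        return 2  # $1-2,999 in year prior
    return 1      # $3,000 or more in year prior


def weighted_job_entry_score(registrants):
    """Sum of points over registrants who entered a job.

    `registrants` is an iterable of (prior_year_earnings, entered_job) pairs.
    """
    return sum(job_entry_points(earnings)
               for earnings, entered_job in registrants if entered_job)


# Hypothetical caseload: three job entries, one registrant without a job.
caseload = [(0, True), (1500, True), (5000, True), (0, False)]
print(weighted_job_entry_score(caseload))  # 4 + 2 + 1 = 7
```

An unweighted count would score this caseload as 3 job entries regardless 
of who obtained them; the weighted score rises when the entries come from 
registrants with no recent earnings. 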

The correlations of this weighted measure and welfare impacts are 
summarized in Table 5.3. A positive correlation between the indicator and 
the impact was found in all but one case, and the correlations were 
statistically significant in three instances. The longer-term version of 
the weighted indicator improved the results. Job entries were positively 
and significantly correlated with all earnings impacts, and were also 
positively correlated with welfare impacts. 

Job entries weighted this way were also positively correlated with the 
total net value of the program both to program registrants and to 
government budgets. These value estimates combine earnings and AFDC impacts, 
based on per-person estimates, as well as individual estimates of net 
program costs and program effects on taxes, Medicaid and other outcomes. 
Overall, then, job entries weighted by a person's prior earnings provide a 
simple-to-use performance measure that performed much better than 
unweighted job entries and case closure measures in the San Diego and 
Baltimore programs. 

Table notes: 

... controlling for the short-term indicator. If the partial correlation of a 
longer-term version raised the indicator's rank from the two lower to the two 
higher ratings, or from "fair" to "good," that change is noted in brackets in 
the table. 

"Short-term" and "longer-term" are defined as follows: 

    Short-term job entry      Any UI earnings in quarters 2 or 3 
    Longer-term job entry     Any UI earnings in quarters 4 through last 
    Short-term off-welfare    No AFDC payments in quarter 3 
    Longer-term off-welfare   No AFDC payments in quarter 6 

Weights were assigned to job entry scores on the basis of prior earnings: 

    Not employed in year prior              4 points per job entry 
    $1-2,999 earnings in year prior         2 points per job entry 
    $3,000 or more earnings in year prior   1 point per job entry 

Nevertheless, this weighting scheme is not the final word on performance 
measurement. Employment-based measures may not be appropriate for 
programs with different goals (although the general principle of weighting 
could be adapted to other outcome measures, such as wage levels and job 
retention rates, which are not tested in this paper). Problems still 
remain in certain areas: i.e., determining the "points" to award job 
entries achieved without program assistance. Moreover, impacts alone may 
not provide the comprehensive picture of program participation sought by 
many administrators. 

The next section briefly considers the implications of this research 
in developing some alternative performance measures of client activity in 
program components. 

B. Participation and Coverage 

Performance measures based on program participation have often been 
used as an alternative, or an addition, to employment and welfare outcome 
measures. Compared to outcome measures, participation rates have both 
advantages and disadvantages. One clear advantage is that participation 
can be easily observed in the short term. One disadvantage is that the 
"intensity" of participation may not be easy to measure. For example, 
registrants in independent job search are counted as participants, but some 
of them have very little to do. 

In-program activity measures have been important because many view 
participation as a precondition for impacts. However, such measures have 
two problems. First, in mandatory programs, an "active participation" 
count ignores a good deal of program activity, much the same as a placement 
rate is a limited measure of program effects on employment. In 
mandatory programs, the behavior of nonparticipants is critical since 
nonparticipants may look for and find work or leave welfare in lieu of 
participating. Sanctioning and other program contact with nonparticipating 
individuals are explicitly intended to affect their behavior. 

Second, participation may be less closely linked to impacts than 
short-term outcomes. Participation measures may cause staff to focus on 
the provision of services, whether or not individuals need them. A drive 
for high participation levels may result in program expenditures on those 
who are most likely to leave welfare on their own. 

If participation measures are used, the subgroup impact findings 
indicate that priority should be given to registrants with poor work records. 
The same weighting scheme just applied to job entries can be used to 
develop weighted participation measures. 

Another approach with considerable potential is the use of program 
"coverage" measures. Such measures have only been used in evaluation 
research, and have yet to be developed for use as program performance 
indicators. These measures would count, in addition to cases of 
participation per se, cases in which participation is no longer required or 
where sanctions for nonparticipation have been imposed. The concept of 
coverage takes into account the normal welfare caseload turnover, but it 
does so without requiring information about prior employment and welfare and 
need not involve weights. 

Under a coverage formula, a client might be counted as "covered" by 
program requirements if any of these outcomes is achieved: 

1. Completes or is completing program requirements; 

2. Becomes employed; 

3. Leaves AFDC; or 

4. Is sanctioned for nonparticipation. 
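
The four conditions listed above can be sketched as a coverage rate over a 
caseload. This is an illustration under stated assumptions: the record 
type, its field names, and the sample clients are hypothetical, and the 
report does not prescribe a particular formula. 

```python
from dataclasses import dataclass


@dataclass
class ClientStatus:
    """Hypothetical per-client flags, one for each listed outcome."""
    completed_or_completing: bool = False
    employed: bool = False
    left_afdc: bool = False
    sanctioned: bool = False


def is_covered(client):
    """A client is covered if any one of the four outcomes holds."""
    return (client.completed_or_completing or client.employed
            or client.left_afdc or client.sanctioned)


def coverage_rate(clients):
    """Share of clients covered; each client counts at most once."""
    clients = list(clients)
    if not clients:
        raise ValueError("coverage rate is undefined for an empty caseload")
    return sum(is_covered(c) for c in clients) / len(clients)


caseload = [
    ClientStatus(completed_or_completing=True, employed=True),  # counted once
    ClientStatus(left_afdc=True),
    ClientStatus(sanctioned=True),
    ClientStatus(),  # still on welfare, no activity: not covered
]
print(coverage_rate(caseload))  # 0.75
```

Because a client who both participates and finds a job is counted only 
once, raising the rate requires reaching the remaining uncovered clients. 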

To maximize coverage, administrators are automatically directed toward 
the longer-term recipients, who are more likely to remain 
uncovered. People who seem likely to remain on welfare and enrolled in the 
program will receive attention. This differs from programs in which 
unweighted participation measures are used, where the participation of 
short-term welfare recipients "counts," even if they would have left 
welfare quickly without special services. With coverage measures, programs 
have less incentive to serve only the most job-ready enrollees, since a 
client can be counted only once as covered, either through "participation" 
or "placement." 

Data for experimentals in the three programs studied illustrate how a 
coverage measure might work in practice. In these programs, only 
10 to 20 percent of experimentals were still on welfare nine months 
after enrollment and had not begun employment, participated in any 
major component, or been sanctioned for not participating. (At any 
point in time during the nine-month period, however, the coverage rate 
would be lower.) This conveys a useful overall impression to legislators 
and the public about how a program is managing to work with its eligible 
caseload. In addition, because some two-thirds of the "non-covered" 
experimentals in the studied programs were recipients, and three-quarters of 
this group had no prior earnings, a coverage standard for welfare employment 
programs could shift attention toward these more dependent subgroups. 



No short-term performance indicator is ideal. However, this analysis 
indicates that, in welfare employment programs, measures should take 
account of differences in the welfare dependency and employability of the 
individuals served. In principle, any of several indicators, combined 
judiciously with other information, can be used to measure program 
performance. This analysis suggests that weighted outcome measures correct 
some of the defects of common unweighted measures. Coverage measures also 
hold promise. The second phase of the research will provide additional 
information on choosing appropriate program measures and standards. 

1. Results of the full benefit-cost analysis are described in an 
internal working paper by the authors. The findings of this 
analysis generally parallel the impact findings. 

2. The use of the term "placement" is avoided in this paper. The 
term was originally used by the employment service to denote 
referral of a client to a particular job opening by program 
staff. It is therefore inappropriate for programs that rely 
on a client's own job search efforts. In addition, placements, 
or self-reported employment, tend to understate employment 
and earnings because recipients sometimes do not report 
jobs to welfare staff. 

Similarly, the term "off-welfare" is used rather than "case 
closure" because it is more inclusive. It covers persons who 
apply for AFDC, enter a program, but then quickly leave the 
welfare system without having been approved for a grant (i.e., 
without ever having had a case opened). 

"Off-welfare" and "welfare reduction" indicators are not 
identical. The former looks only at whether families are 
receiving any AFDC payment, and it is stated as a numerical 
count or as a percent. The various welfare reduction formulas 
in use subtract pre-program welfare grant levels for clients 
from their post-program welfare receipt to arrive at a dollar 
figure, either aggregate or per registrant. The first phase 
of this study tests an off-welfare indicator rather than a 
welfare dollar reduction indicator because the pre-program 
data necessary to simulate that indicator are lacking from the 
San Diego and Baltimore research data bases. 

3. The role of performance scores in the actual distribution of 
funds has been quite small. The bulk of federal WIN funds 
have been allocated to states according to number of WIN 
registrants. On the basis of budget appropriations during the 
1970s, it has been determined that incentive rewards for 
performance based on this formula could amount to about one-third 
of all federal WIN moneys given to states. (See Office of 
Family Assistance, 1985, pp. 13-14.) 

In practice, annual funding changes have been restricted in 
other ways. WIN regional coordinators have had discretionary 
powers, and incentive moneys could be allocated for local 
performance achievements not incorporated in the mathematical 
formula or on the basis of other considerations. As a result, 
only about 3 percent of funds distributed in a given year have 
reflected performance scores, although cumulative changes 
across the years could have amounted to more. (Ibid., p. 21.) 

Job retention has been a more important determinant of the 
program performance score in the discretionary part of the WIN 
Allocation Formula than job entry, although there is some 
evidence that the complexity of the formula kept this fact hidden 
from line operators (Mitchell, Chadwin, Nightingale, 1980, p. 
287). The relative potential of each element of the formula 
to raise a state's overall performance score differed, depending 
on how high or low its score on each element might be. 
The complexity of the discretionary part of the formula was 
such that determining which elements had the greatest influence 
on scores would be very difficult without sophisticated 
analysis and simulation. 

Participation is observed now, whereas outcomes may be 
observed only after some months and may require substantial 
effort in locating clients to ask about their employment 
status. Monitoring subgroup participation may be the most 
effective way of ensuring local compliance with an optimal 
targeting plan. 

The problem of specifying optimal performance standards for 
independent local service providers for JTPA programs has been 
highlighted by the growing use of fixed-priced contracting. 
The language of JTPA has encouraged the use of fixed-priced 
contracting because all costs incurred can be allocated to 
"training," thus helping programs to comply with the 15 percent 
cap on administration costs. For a thorough discussion 
of the possibilities and problems in fixed-priced contracting 
see Wallace, 1985. 

Indicators that make use of pre-program client characteristics 
are often referred to as change-based indicators, with simple 
outcomes designated as level indicators. The example given in 
this chapter for San Diego would suggest that change-based 
indicators should prove superior to simple outcomes as proxies 
for real program impact. In that case, the change from no 
pre-program employment to employment during the follow-up 
period was associated with the larger program-induced impact 
on employment. The weighted job entry rates tested in this 
paper are change-based indicators, since they award more 
performance points for the employment of clients who were not 
employed in the recent pre-program period. 

The relevant literature on indicator validation is based on 
several analyses of CETA. Borus, 1978, found that job entry 
had very little power to indicate net impact for CETA. Gay 
and Borus, 1980, in a study of four pre-CETA programs, found 
change indicators to be somewhat superior, and rated simple 
job entry as one of the poorest measures. In contrast, Geraci 
and King, 1981, found evidence supporting job entry as the 
better measure, as did Geraci, 1984. Zornitsky et al., 1985, 
produced results favoring level indicators. The latter three 
studies also concluded that post-program follow-up added 
valuable information about employment at the point of 
termination. 

These studies all suffer serious methodological problems from 
having been based on non-experimental impact estimates. The 
principal issue (the value of level indicators versus 
change-based indicators) is still the most pressing one to 
be resolved in performance monitoring. The issue is complicated 
by the possibility that the best class of indicators may 
be different for welfare women, adult men and youths. Adult 
men entering employment programs typically exhibit a temporary 
pre-program dip in earnings, making prior earnings problematic 
as a proxy for earnings capability. Youth often have short 
and erratic earnings histories, and a pre-program earnings 
baseline may therefore be meaningless for them. 

See Bane and Ellwood, 1983; Ellwood, 1986. 

See Ellwood, 1986, p. xii. 

The wait-and-see approach does not rely on an ability to pre- 
dict future dependency and does not face the political hurdle 
of denying services to subgroups based on marital status and 
age of youngest child. On the other hand, an initial period 
may have been wasted, a period in which improvements could 
have been made. Ellwood in his 1986 work suggests that 
evidence favors early identification and targeting over the 
wait-and-see strategy. 

See O'Neill et al., 1984, p. 84. 

See MDRC, 1980. 



CHAPTER ? 

See Goldman et al., 1986; Friedlander et al., 1985; and Riccio 
et al., 1986. For a summary of the demonstration's findings 
thus far, see Gueron, 1987. 

In San Diego, a second experimental group received job search 
only. The program and its evaluation were also carried out 
for AFDC-U. Neither of these research groups is analyzed in 
this study. 

3. In this report, participation and sanctioning rates were 
calculated on somewhat different bases than in the published 
state reports. In this study, the base is always "all 
experimentals." In the state reports, the base of "all program 
registrants" was often used. Most experimentals did, however, 
register for the programs, and the differences between the 
figures cited here and those published in the state reports 
are not large. 

4. Sample sizes in this report differ slightly from those in the 
corresponding state reports. An attempt was made here to 
assign values to demographic data where these were missing. 
If missing data could not be inferred with reasonable 
certainty, the cases were dropped from the analysis. The 
effect on sample size was the gain of 7 cases in San Diego and 
54 cases in Baltimore, but a loss of 32 cases in Virginia. 

5. Randomization produced similar experimental and control groups 
with, however, some differences. There were small differences 
between research groups in ethnicity and marital status in the 
San Diego sample. In the other two samples, small differences 
were apparent in measures of education, prior employment and 
earnings. 

6. This does not mean that the indicated subgroups account for 
the bulk of all AFDC expenditures. Benefits paid to families 
outside of the WIN-mandatory sample are not counted. 
Nationally, about two-thirds of AFDC families are WIN-exempt. 



1. For more complete reports of data quality control, see the 
individual state reports. 

2. For more detail about data sources and follow-up, consult the 
state reports. 

3. The distinction between unconditional and conditional impact 
estimates can be developed as follows. The basic impact 
regression model is 

    Y(T, S1, S2, X) 

where 

    Y  = outcome variable 
    T  = experimental group dummy variable 
    S1 = dummy variable for subgroup dimension 1 
    S2 = dummy variable for subgroup dimension 2 
    X  = vector of additional control variables 

The full sample impact is the coefficient of T. The unconditional 
subgroup estimates for S1 come from the regression 
model 

    Y(TS1, TNS1, S1, S2, X) 

where 

    TS1  = T • S1 
    TNS1 = T • (1-S1) 

The impacts on groups S1=1 and S1=0 are read from the 
coefficients of TS1 and TNS1, respectively. Finally, the 
conditional model is 

    Y(T, TS1, TS2, S1, S2, X) 

where 

    TS2 = T • S2 

and the coefficient of T is the impact when S1=0 and S2=0. 
The coefficient of TS1 is the additional impact attributable 
to the S1 characteristic when S2 is held constant. The 
coefficient of TS2 is the additional impact attributable to 
the S2 characteristic when S1 is held constant. 

Interactive specifications are possible for both unconditional 
and conditional models. For the unconditional case, 

    Y(TS12, TS1N2, TSN12, TSN1N2, S1, S2, S12, X) 

where 

    TS12   = T • S1 • S2 
    TS1N2  = T • S1 • (1-S2) 
    TSN12  = T • (1-S1) • S2 
    TSN1N2 = T • (1-S1) • (1-S2) 
    S12    = S1 • S2 

For the conditional case, 

    Y(T, TS1, TS2, TS12, S1, S2, S12, X) 

Coefficients in this latter model can be combined to reproduce 
the unconditional interaction estimates exactly. But when a 
third subgroup dimension is introduced, S3, the term TS3 in 
the conditional model would make the two sets of interaction 
estimates different. 
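
The equivalence of the two interactive specifications can be checked 
numerically. The sketch below is an illustration on simulated data, not the 
study's data: the sample, the coefficient values used to generate Y, and 
the small least-squares helper are all assumptions made for the example. It 
fits both models and rebuilds the four unconditional cell impacts from the 
conditional coefficients. 

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Simulated sample (hypothetical values throughout).
random.seed(0)
data = []
for _ in range(400):
    t, s1, s2 = (random.randint(0, 1) for _ in range(3))
    x = random.gauss(0, 1)
    y = (100 + 40 * s1 + 25 * s2 + 5 * x
         + t * (50 + 30 * s1 - 20 * s2 + 10 * s1 * s2)
         + random.gauss(0, 20))
    data.append((t, s1, s2, x, y))
ys = [d[4] for d in data]

# Unconditional: Y(TS12, TS1N2, TSN12, TSN1N2, S1, S2, S12, X) plus a constant.
uncond = ols([[1, t * s1 * s2, t * s1 * (1 - s2), t * (1 - s1) * s2,
               t * (1 - s1) * (1 - s2), s1, s2, s1 * s2, x]
              for t, s1, s2, x, _ in data], ys)
# Conditional: Y(T, TS1, TS2, TS12, S1, S2, S12, X) plus a constant.
cond = ols([[1, t, t * s1, t * s2, t * s1 * s2, s1, s2, s1 * s2, x]
            for t, s1, s2, x, _ in data], ys)

direct = dict(zip(["TS12", "TS1N2", "TSN12", "TSN1N2"], uncond[1:5]))
b_t, b_ts1, b_ts2, b_ts12 = cond[1:5]
rebuilt = {
    "TS12": b_t + b_ts1 + b_ts2 + b_ts12,  # S1=1, S2=1
    "TS1N2": b_t + b_ts1,                  # S1=1, S2=0
    "TSN12": b_t + b_ts2,                  # S1=0, S2=1
    "TSN1N2": b_t,                         # S1=0, S2=0
}
for name in direct:
    print(name, round(direct[name], 4), round(rebuilt[name], 4))
```

The two sets of numbers agree up to rounding error because both models 
span the same set of regressors; only the parameterization differs. 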

See Borus, 1978. 

Individual impact estimates are made by (1) regressing 
employment and welfare outcomes on demographic and background 
characteristics for the experimental and control groups, and 
then (2) using the coefficients obtained from these 
regressions, along with the characteristics of individual 
members of the experimental group, to predict individual 
impacts. The first stage estimate is made from the conditional 
subgroup impact regression model. That is, from the 
regression that contains the full array of experimental 
subgroup interactions, a prediction is made for the expected 
program impact on earnings and welfare receipt for each person 
in the experimental sample. The net impact estimate will 
differ for each person, depending on the demographic, and 
prior work and welfare characteristics at the time of entry 
into the research sample. 

These are sometimes referred to as direct estimates. For 
example, with treatment interactions for prior employment, 
education and number of children, one impact would be 
predicted for an experimental with no prior employment, no 
diploma, one child; a different net impact would be predicted 
for an experimental with any difference in any of these 
characteristics. The more variance in the dependent variable 
that can be accounted for by the regression model, the better 
the predicted net impacts. At the present state of knowledge, 
however, most of the variation in the outcome measures cannot 
be explained. 



The applicant/recipient distinction is often a significant one 
for program operators, as it was in San Diego. Also, F-tests 
for homogeneity of regression coefficients have consistently 
turned up large differences in regression models for applicants 
and recipients in welfare receipt equations. For these 
reasons, and to more easily handle expected differences in 
applicant and recipient behavior, the samples were split for 
the regression runs. 

2. The composite estimates are a weighted average of estimates of 
the impacts and adjusted means for the five estimation 
subsamples. The weights are the inverse estimated standard error 
for each impact estimate, normalized by dividing by the sum of 
the inverse standard errors. The choice of weights minimizes 
the variance of the composite estimate, satisfying one of the 
objectives of pooling. Another choice of weights could have 
been the fraction of the total sample accounted for by each of 
the five estimation samples. But the designs in San Diego and 
Virginia are unbalanced, with about a 2:1 experimental-control 
ratio, and the interpretation of such a weighting scheme is 
not clear. A final alternative would have been to weight each 
sample by the fraction of all work/welfare program enrollees 
in the country who are in programs similar to each of the 
three under study here, an endeavor beyond the scope of this 
paper. 
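
The weighting described in this note can be illustrated with a short 
calculation. The sketch follows the note's wording (weights proportional to 
the inverse of each estimated standard error, normalized to sum to one); 
the function name and the sample figures are hypothetical and are not 
results from the study. 

```python
def composite_estimate(estimates, standard_errors):
    """Weighted average of subsample impact estimates, with weights equal
    to inverse standard errors divided by their sum."""
    inverse = [1.0 / se for se in standard_errors]
    total = sum(inverse)
    weights = [w / total for w in inverse]
    value = sum(w * e for w, e in zip(weights, estimates))
    return value, weights


# Two made-up subsample impacts (dollars) with their standard errors.
value, weights = composite_estimate([100.0, 200.0], [10.0, 30.0])
print(round(value, 2), [round(w, 2) for w in weights])
```

With these figures the more precisely estimated subsample receives three 
times the weight of the other, pulling the composite toward its estimate. 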

3. For this analysis, impact regressions were run on the pooled 
sample of applicants and recipients, first in Baltimore and 
then in Virginia. The model specified an experimental group 
dummy, a dummy for applicants, and a dummy for an 
experimental-applicant interaction. This last dummy gave the 
estimate of the unconditional impact difference. Interactions 
of experimental group membership with all other subgroup 
characteristics were then added and the same coefficient read 
again. The t-statistic for this coefficient therefore gives 
the statistical significance of the conditional difference in 
impacts between applicants and recipients. Applicant/recipient 
differences in earnings gains were statistically 
significant in Baltimore but not in Virginia. 

A dependency index was created as follows. Average earnings 
and average AFDC dollars received were regressed on 
demographic variables for control group clients in Virginia. 
These coefficients were then used to predict follow-up 
earnings and welfare benefits for sample members in San Diego. 
The index variable was created as predicted earnings minus 
predicted welfare. An earnings impact regression was then run 
for San Diego using linear through quartic terms in the index 
and linear through quartic terms in the interaction of the 
experimental group dummy with the index, plus the experimental 
group dummy itself. This dummy and the four interaction 
coefficients were then used to plot predicted impacts at 
5-percent n-tile points of the index variable. The procedure 
was repeated for Baltimore. 

5. The negative earnings impacts for the subgroup with some 
year-prior earnings may indicate that the longer-term employability 
activities for welfare recipients with an employment record 
keep such persons out of the labor market when they would have 
been working. It seems likely, however, that a major part of 
the negative differential is anomalous, a product of chance. 
The recipient control group in this prior earnings category 
had higher earnings than the corresponding applicant controls, 
whereas all the other recipient control subgroups earn less 
than their applicant control counterparts. This suggests that 
the true earnings losses of this recipient subgroup might not 
be as severe if the experiment were to be replicated. 

6. AFDC benefit levels also vary across counties in Virginia. 

7. See Riccio et al., 1986, p. xiv. 



CHAPTER 5 

1. Under-reporting of job entries can occur when case heads who 
leave welfare because they have found jobs do not report 
employment. Particularly in large urban areas with large 
caseloads, cases are often closed because the client fails to 
respond to some attempt at contact, making it impossible to 
record employment status or other eligibility factors. In 
addition, reports of employment obtained by income maintenance 
staff for the purpose of adjusting grant payments are not 
always reported back to the staff of the employment program. 

2. Regressions for average earnings and average welfare payments 
over quarter 4 through the last quarter were run with all 
treatment-subgroup interactions in the model at once. The 
coefficients of these interactions were then used to predict 
for every experimental group member the expected net impact on 
earnings and welfare receipt. These new variables were then 
correlated with employment and off-welfare status, using only 
the experimental group sample. 

3. These weights represent approximately the relationship of 
control group mean earnings for prior-earnings categories in 
the composite impact table in the preceding chapter. 